srean's comments | Hacker News

Except that their success in the time series domain has been rather lackluster and elusive. It is still one of the few domains where old-school models are not only less work to maintain but also more accurate. There are a few exceptions here and there. Every year there are a few neural-net based challengers. You can follow the M series of competitions from its start to see this evolution.

Maybe because useful time-series modeling is usually really about causal modeling? My understanding is that mediated causality in particular is still very difficult, where adding extra hops in the middle takes CoT performance from like 90% to 10%.

Yes causal models are hard.

NNs do ok on those time series problems where it is really about learning a function directly off time. This is nonlinear regression where time is just another input variable.

Cases where one has to adjust for temporally correlated errors seem to be harder for NNs. BTW I am talking about accuracies beyond what typical RNN variants will achieve, which is pretty respectable. It's just that more complicated DNNs don't seem to do much better in spite of their significant model complexity.
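To make the distinction concrete, here is a small illustrative sketch (my own toy example, not from any competition entry): the "function of time" framing treats time and simple transforms of it as ordinary regression features, and the lag-1 autocorrelation of the residuals is a quick check for the temporally correlated errors that this framing ignores.

    # Illustrative toy sketch (my own, not a competition entry): time as just
    # another regression feature, plus a crude check for correlated residuals.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    t = np.linspace(0, 10, 500)
    y = np.sin(2 * np.pi * t / 3) + 0.1 * t + 0.2 * rng.standard_normal(t.size)

    # Nonlinear regression with time (and simple transforms of it) as inputs.
    X = np.column_stack([t, np.sin(2 * np.pi * t / 3), np.cos(2 * np.pi * t / 3)])
    net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
    net.fit(X, y)

    # If the residuals are temporally correlated (e.g. AR(1) errors), the
    # i.i.d.-error assumption behind plain regression breaks, and a bigger
    # network does not fix it; adjusting for the error structure does.
    resid = y - net.predict(X)
    rho_hat = np.corrcoef(resid[:-1], resid[1:])[0, 1]  # lag-1 autocorrelation
    print(f"lag-1 residual autocorrelation ~ {rho_hat:.2f}")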


LightGBM won M5 and it wasn't even a competition.

The task was slightly different and favored GBMs. Note that they aren't NNs, whose underwhelming performance was what my comment was about.

The M series of competitions changes the tasks every year to explore which models perform best under different scenarios. As I mentioned, neural-network-based models win here and there, but their performance is very spotty overall.


Yes, Thorp had secured a hard-to-schedule meeting with Shannon regarding his research. It turned out Shannon was more interested in the analysis of a few gambling games that Thorp had thrown into the conversation.

The wide range of interests Shannon held, from weird gadgets to possibly the most famous master's thesis ever written, has me in awe and respect. Had he been a good storyteller, I suspect people would have been as familiar with his name as with Feynman's.


Curious, at what price per barrel do US oil fields become profitable? For their own domestic consumption they don't really need the Iranian oil, do they? It seems to be the case that it's the rest of the world that needs the oil, and the US needs the rest of the world to not be pissed at the US.

> at what price per barrel do US oil fields become profitable?

$30 to 70 per barrel [1]. (Pretty much all production is profitable above $100/barrel.)

> they don't really need the Iranian oil, do they?

Our refineries can't process our own crude. So we export crude and import refined products.

That said, yes, the oil exports do blunt the net effect of the blow. If pressure really rose, one could tax the excess profits to directly reduce gas prices.

[1] https://www.opxai.com/why-your-oilfield-will-fail-at-60-oil-...


Oh thanks. Did not know about the refining bit.

I am always pleasantly amused that many HN folks share with me a love for weaving, knitting and knotting; not to mention ropes.

Dang had once posted a long list of HN discussions on these topics.

I think there is something about them that squirts a little bit of dopamine into our pattern-seeking, puzzle-solving brains.

For me, one of the draws was how the symmetry of the woven pattern gets weft into the cloth. Multi-shaft looms do it differently from, say, a Kashmiri rug.

When I joined HN decades ago I had no idea that there would be this shared interest. Frankly, there was no reason for this to be the case.

Then one day this happened

https://news.ycombinator.com/item?id=44462404


Fun fact: the original developer of TK Solver (or TK!Solver) was Milos Konopasek, a textile engineer from Czechoslovakia.

TK Solver is a software cousin of the famous VisiCalc, developed by the same company, Software Arts.

VisiCalc has been discontinued but TK Solver is still being sold today by Universal Technical Systems (UTS) [1].

Milos also developed the Question Answering System (QAS), running on a PDP-10. It operated on equations relating input yarn, cloth area, fiber strengths, etc. For a desired cloth strength you could solve for fiber strength, or, given fiber strength, you could solve for the cloth strength. The same operations can still be performed in TK Solver.
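For flavor, that declarative "state the relation once, solve in whichever direction you need" style can be sketched with sympy; the relation below is a made-up placeholder, not the actual QAS or TK Solver cloth model.

    # Hedged sketch of the solve-in-any-direction style described above.
    # The relation itself is a toy placeholder, NOT the real QAS/TK Solver model.
    import sympy as sp

    fiber_strength, yarn_count, cloth_strength = sp.symbols(
        "fiber_strength yarn_count cloth_strength", positive=True)

    # One relation, stated once.
    relation = sp.Eq(cloth_strength, 0.8 * fiber_strength * sp.sqrt(yarn_count))

    # Given fiber strength and yarn count, solve for cloth strength ...
    print(sp.solve(relation.subs({fiber_strength: 50, yarn_count: 36}), cloth_strength))

    # ... or, given a desired cloth strength, solve the same relation for fiber strength.
    print(sp.solve(relation.subs({cloth_strength: 300, yarn_count: 36}), fiber_strength))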

[1] Comprehensive Mathematical Software Tool for Engineers:

https://www.uts.com/Products/TKSolver


The Jacquard loom was one of the first machines that could operate based on a set of symbols / patterns encoded on a punched card. Computers ran on punched cards until the 1970s. Voting machines used punched cards until pretty recently (the infamous "hanging chads" of the 2000 US election).

I have heard it said that the word "technology" shares its roots with the word "textiles". Maybe it's not so surprising that there would be a shared interest as well!

https://www.etymonline.com/word/*teks-

> Proto-Indo-European root meaning "to weave," also "to fabricate," especially with an ax, also "to make wicker or wattle fabric for (mud-covered) house walls."

> It might form all or part of: architect; context; dachshund; polytechnic; pretext; subtle; technical; techno-; technology; tectonic; tete; text; textile; tiller (n.1) "bar to turn the rudder of a boat;" tissue; toil (n.2) "net, snare."

> It might also be the source of: Sanskrit taksati "he fashions, constructs," taksan "carpenter;" Avestan taša "ax, hatchet," thwaxš- "be busy;" Old Persian taxš- "be active;" Latin texere "to weave, fabricate," tela "web, net, warp of a fabric;" Greek tekton "carpenter," tekhnē "art;" Old Church Slavonic tesla "ax, hatchet;" ...


According to William Dalrymple, India was once responsible for a third of the world's GDP, with the most advanced textile industry in the world before the East India Company dismantled it.

A Sanskrit origin is intriguing.


As a note, Sanskrit is a "sibling" or cousin of Latin and Greek in the family tree of languages ( https://upload.wikimedia.org/wikipedia/commons/4/4f/IndoEuro... ). Neither Latin nor Greek grew from Sanskrit; rather, each (and many other languages) grew from Proto-Indo-European, which is believed to have existed somewhere around 4500 to 2500 BC.

https://en.wikipedia.org/wiki/Indo-European_vocabulary (the "Construction, fabrication" section includes *teks)


As a novice in the history of languages, and being k-lingual in a couple of Indian languages and English, I find the Farsi language such a delightful stream of discoveries.

Regardless of which k of my languages I restrict myself to, I end up discovering words that are the same between Farsi and that language.

I understand that this should not be surprising given their shared roots in the Indo-Iranian languages, the largest branch of Indo-European.

Nonetheless it is delightful every time I discover a new one by accident.


Hmm, Finnish has "tehdä" (to do,make,fabricate) with forms like "tekee" and "teko-".

Huh; that seems like a way better etymology for the "tada!" flourish than any of the explanations in this rather heated discussion: https://english.stackexchange.com/questions/33564/origin-of-...

Where did you think punch cards came from? You know, the punch cards used to represent the first computer programs?

https://en.wikipedia.org/wiki/Punched_card. Read the precursor section.

Basile Bouchon developed the control of a loom by punched holes in paper tape in 1725. The design was improved by his assistant Jean-Baptiste Falcon and by Jacques Vaucanson.[5] Although these improvements controlled the patterns woven, they still required an assistant to operate the mechanism.

In 1804 Joseph Marie Jacquard demonstrated a mechanism to automate loom operation. A number of punched cards were linked into a chain of any length. Each card held the instructions for shedding (raising and lowering the warp) and selecting the shuttle for a single pass.[6]


Indeed.

To help recover from the occasional 'dropped all the cards on the floor' accident, a diagonal stripe was drawn across the side of the deck once the cards had been stacked in the right order.

This was used for computers for sure, not sure about the Jacquard looms.

With complete freedom in addressing (raising) any subset of the warps, these looms were very expressive. My favorites are multi-shaft looms.

In a k-shaft loom you can only define k elementary subsets of all the warps. That makes for more interesting problem-solving instances and mathematical structure.
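One way to see the constraint: each warp end is tied to exactly one shaft, so warps on the same shaft must rise and fall together, and a drawdown is weavable on k shafts exactly when its warp columns take at most k distinct values. A toy check (my own encoding, not a standard weaving tool):

    # Toy check (my own encoding, not a standard weaving tool): count the
    # distinct warp lift-columns; that is the minimum number of shafts needed.
    def shafts_needed(drawdown):
        """drawdown[pick][warp] is 1 if that warp is raised on that pick."""
        columns = {tuple(row[w] for row in drawdown) for w in range(len(drawdown[0]))}
        return len(columns)

    # A 2/2 twill repeat: four distinct warp columns, hence a 4-shaft weave.
    twill = [
        [1, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 1],
        [1, 0, 0, 1],
    ]
    print(shafts_needed(twill))       # 4
    print(shafts_needed(twill) <= 4)  # True: weavable on a 4-shaft loom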


Since you asked: that's exactly where I thought punch cards came from.

Always good to learn more about the timeline of techniques lost in the mists of time. Some of the finest works of art were 'coded' in fibers, much more durable than most other media!

Including, inasmuch as you can consider it fine art, the ROM for the Apollo onboard computer! https://en.wikipedia.org/wiki/Core_rope_memory

The creator of SNOBOL and Icon programming languages, Ralph Griswold, also developed an interest in weaving and wrote about it; see for instance https://www.thelacebee.com/the-lace-notes/tess-the-professor...

Thanks for the link. I did not know about this before. I have been to the bibliography page linked from there many times before but did not know the Icon connection.

Got reminded of Durer's exquisite knot works.


I think it's not just puzzle solving - for me it's the idea of creating something from raw materials where that something is itself a standard building block. It appeals to the same part of me that programming does.

Blake's and Durer's artwork are two of my favorites.

What I find so teasingly difficult to explain is that despite being so different there is some shared aesthetic value between them that I cannot quite pin down in words.

Perhaps their strong geometric undertones and a certain muscularity in them.


Non-“art first”, cosmological (in the religious sense), sketch-forward detail as principal expressive form… I mean one studied the other right? And the author of this piece wrote about Durer as well

Agree with your observation. Blake and Durer both worked in printmaking. I wonder if the processes and aesthetics there resulted in some detectable affinity between their works.

Toss in some Bosch for flavor.

Those three guys could wipe the floor with most of modern art.

(The Blake painting is tucked away in an almost-attic of the now "Tate Britain" old building in a quiet out of the way street, while the "Tate Modern" blockhouse graces the Thames south bank, mostly filled with glitzy trash. So it goes.)


Thanks for introducing me to Bosch. I immediately recognised many of his works, but his name had not registered.

https://nautil.us/the-great-silence-237510

One of my all time favourite short stories, with or without intelligent parrots.

Time for me to read it again. This is the Arecibo story; don't miss it if you haven't read it before.

"You be good".

Strangely enough, I was having a lot of difficulty coaxing Google to fetch this link.


I’m not sure if you used “Classic Google” or not, but I put the quoted quote in to Google AI Mode (disclaimer; I am one of its developers) and got a full description of the story with links to online hostings of the full text in under 1 second. Not the same URL as your result, and I don’t know the IP validity of the hosting result pages I got, though.

I recalled (once I was reminded of the author) that I read this originally in one of his Anthologies. I strongly recommend to everyone who likes reading and thinking to buy both of his books!


I got some of those links and links to the summary of the story.

But I did not want a summary (why massacre such a beautiful story *), nor the later links (pretty bad visual presentation of the story), but the Nautilus link in particular.

I think that's where I had read it first on the web, by far the best layout compared to the other links.

Even a few years ago the Nautilus link used to be the canonical (first) result.

* If I want Michelangelo's David summarised, I think I would mention 'summary' explicitly.


Don't worry, when stochastic grads get stuck, math grads get going.

(One of) The value(s) that a math grad brings is debugging and fixing these ML models when training fails. Many would not have an idea about how to even begin debugging why the trained model is not working so well, let alone how to explore fixes.


Debugging ML models (a large part of my job) requires very little math. Engineering experience and mindset are a lot more relevant for debugging. Complicated math is typically needed when you want to invent new loss functions, or new methods for regularization, normalization or model compression.

You are perhaps talking about some simple plumbing bugs. There are other kinds:

Why didn't the training converge?

Validation/test errors are great, but why is performance in the wild so poor?

Why is the model converging so soon?

Why is this all zero?

Why is this NaN?

Model performance is not great; do I need to move to something more complicated, or am I doing something wrong?

Did the nature of the upstream data change?

Sometimes this feature is missing; how should I deal with that?

The training set and the data on which the model will be deployed are different. How to address this problem?

The labelers labelled only the instances that are easy to label, not chosen uniformly from the data. How to train with such skewed label selection?

I need to update the model with a few thousand new data points but not train from scratch. How do I do it?

The model is too large; which doubles can I replace with float32?

So on and so forth. Many times models are given up on prematurely because the expertise to investigate lackluster performance does not exist in the team.


Literally every single example you provided does not require much in the way of math fundamentals. Just basic ML engineering knowledge. Are you saying that understanding things like numerical overflow or exploding gradients requires a sophisticated math background?

Numerical overflow, mostly no; but in the case of exploding gradients, yes, especially when it comes to devising a way to handle them on your own, from scratch. After all, it took the research community some time to figure out a fix for that.
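For what it's worth, the mitigation that eventually became standard is clipping the gradient norm; a minimal, framework-free sketch (the names and numbers here are mine, purely illustrative):

    # Minimal sketch of gradient-norm clipping; `grads` is a hypothetical list
    # of per-parameter gradient arrays, not tied to any particular framework.
    import numpy as np

    def clip_grad_norm(grads, max_norm=1.0):
        total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
        if total_norm > max_norm:
            scale = max_norm / (total_norm + 1e-12)
            grads = [g * scale for g in grads]
        return grads, total_norm

    grads = [np.array([300.0, -400.0]), np.array([1200.0])]
    clipped, norm = clip_grad_norm(grads, max_norm=5.0)
    print(norm, [g.tolist() for g in clipped])  # norm 1300.0, rescaled down to 5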

But the examples you quoted were not my examples, at least not their primary movers (the NaNs could be caused by overflow, but that overflow can have a deeper cause). The examples I gave have/had very different root causes at play, and the fixes required some facility with math: not to the extent that you have to be capable of discovering new math, or something as complicated as the geometry and topology of strings, but nonetheless math at the level of grad school or an advanced and gifted undergrad.

Coming back to the numeric overflow that you mention: I can imagine a software engineer eventually figuring out that overflow was a root cause (sometimes they will not). However, there's quite a gap between recognizing an overflow and, say, the knowledge of numerical analysis that will help guide a fix.
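One concrete illustration of that gap: spotting that a softmax or log-likelihood overflows is one thing; knowing the log-sum-exp rewrite that makes it stable is the numerical-analysis part. A small sketch (toy values of my choosing):

    # Toy illustration: the log-sum-exp / max-shift rewrite as an example of
    # the numerical-analysis knowledge that turns "this overflows" into a fix.
    import numpy as np

    def naive_log_softmax(z):
        return np.log(np.exp(z) / np.exp(z).sum())   # exp() overflows for large z

    def stable_log_softmax(z):
        shifted = z - z.max()                         # exp() is now bounded by 1
        return shifted - np.log(np.exp(shifted).sum())

    z = np.array([1000.0, 1001.0, 1002.0])
    print(naive_log_softmax(z))    # nan: exp(1000) overflows to inf
    print(stable_log_softmax(z))   # finite, correct log-probabilities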

You say > "literally every single example" ... can be dealt with without much math. I would be very keen to learn from you how to deal with this one, say, without much math.

   The labelers labelled only
   the instances that are
   easy to label, not chosen
   uniformly from the data.
   How to train with such
   skewed label selection 
   (without relabeling properly)
This is not a gotcha, but genuine curiosity, because it is always useful to understand a solution different from your own (mine).

Maybe I don't understand this data labeling issue - are you talking about an imbalanced classification dataset? Are hard classes under-represented, or missing labels completely?

None of those (but they could be added to the mix to complicate matters).

Consider the case where the labeler creates the labelled training set by cherry-picking those examples that are easy to label. He labels many, but selects the items to label according to his preference.

First question: is this even a problem? Yes, most likely. But why? How to fix it? When are such fixes even possible?


Yes, this is a problem - the most challenging samples might not even be present in your training data. This means your model will not perform well if real world data has lots of challenging samples.

This can be partially solved if we make some assumptions about your labeller:

1. they have still picked enough challenging samples.

2. their preferences are still based on features you care about.

3. they labelled the challenging samples correctly.

And probably some other assumptions should hold for the distribution of labels, etc. But what we can do in this situation is first try to model that labeller's preferences by training a binary classifier: how likely is it that they would choose this sample for labelling from the real-world distribution? If we train that classifier, we can then use the inverse of its predicted selection probability as a sample weight when preparing our training dataset (less likely samples get more weight). This would force our main classifier to pay more attention to the challenging samples during training.

This could help somewhat if all assumptions hold, but in practice I would not expect much improvement, and the solution above can easily make it worse - this problem needs to be solved by better labelling.
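A minimal sketch of the weighting idea above (the selection mechanism, the logistic-regression propensity model, and the synthetic data are all assumptions of mine, just to make the steps concrete):

    # Sketch of the idea above: model the labeller's selection, then weight the
    # labelled samples by inverse selection probability. All data and model
    # choices here are illustrative assumptions.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_all = rng.normal(size=(5000, 3))
    # Pretend the labeller prefers "easy" samples: those with small |x0|.
    p_select = 1.0 / (1.0 + np.exp(4.0 * (np.abs(X_all[:, 0]) - 1.0)))
    selected = rng.random(5000) < p_select

    # Step 1: model the selection mechanism (selected vs. not) over all data.
    propensity = LogisticRegression().fit(np.abs(X_all), selected)
    p_hat = propensity.predict_proba(np.abs(X_all[selected]))[:, 1]

    # Step 2: train the main model on the labelled subset with inverse-propensity
    # weights, so under-selected (hard) regions count for more.
    y_labelled = (X_all[selected, 0] + X_all[selected, 1] > 0).astype(int)  # toy labels
    weights = 1.0 / np.clip(p_hat, 1e-3, None)
    main_model = LogisticRegression().fit(X_all[selected], y_labelled,
                                          sample_weight=weights)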

How did you solve it?


I would recommend that you start with one of the classics (not much deep RL in it):

https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutto...

This will have a gentler learning curve. After this you can move on to more advanced material.

The other resource I will recommend is everything by Bertsekas. In this context, his books on dynamic programming and neuro-dynamic programming.

Happy reading.


For that matter, they aren't really Arabic numbers either; Europe got them from the Arabs, though. Hindu-Arabic would be a little more correct.

Liber Abaci by Leonardo of Pisa (Fibonacci) is an important and interesting book to read. In it he is trying to convince the readers to shift to this Hindu-Arabic system he had picked up from the Arabs.

The Fibonacci series was also introduced to Europeans for the first time through this book. I don't recall whether he calls the series the Hindu series in this book or somewhere else. The series was known to Indian mathematicians (Pingala, roughly a contemporary of Euclid) as an enumeration of the sequences of short and long beats that an interval of time could be broken into.
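The beat-counting view is easy to check: the number of ways to fill an interval of n time units with short (1 unit) and long (2 unit) beats satisfies the Fibonacci recurrence, since the last beat is either short or long. A quick illustration:

    # Quick check of the beat-enumeration view: ways to fill n time units with
    # short (1 unit) and long (2 unit) beats satisfy f(n) = f(n-1) + f(n-2).
    def beat_counts(n_max):
        counts = [1, 1]                 # n = 0 (empty) and n = 1 (one short beat)
        for _ in range(2, n_max + 1):
            counts.append(counts[-1] + counts[-2])
        return counts

    print(beat_counts(10))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]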


Funnily enough it had always been geometry of motion for me.

Over the decades this effect has diluted somewhat, but for me time was always some landmark shapes of the hands of the clock and how far the current arrangement of the hands are from the chosen landmark. No names. No numbers.

This caused lots of problems when someone would ask me for the time. I really had to slow down and deliberately translate, with some conscious effort, what I saw into numbers and words. So for some tens of seconds I would be transfixed, frozen, time-sniped.

When I looked up time for myself I would skip the numbers and words entirely.

