Chabs's comments | Hacker News

The Something Awful forums are the obvious candidate here. It costs a flat $10 to create an account. This generally makes trolling, spamming, and scamming a losing proposition and broadly helps with the signal-to-noise ratio.


It traditionally leads to bigger executables and longer compile times. With modern compilers, both of these are non-issues for "most" reasonably-sized projects.


Welp, I was not expecting this to show up on Hacker News so early in the project's life cycle.

This is still fairly early work, so I apologize if the API is not quite as neat as it could be yet.

Still, author here; I'll gladly accept any comments/criticisms.


std::visit mostly allows you to do that already:

https://coliru.stacked-crooked.com/a/be5c44281eea8bc4

The only unfortunate missing piece of the puzzle is that there's no trivial way to create a closure out of this, so it requires a bit more manual work to propagate local state to the visitor.
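
Roughly what that looks like with a hand-written visitor (a sketch of the approach, not the linked snippet; the names are mine): any local state has to be threaded through the visitor by hand, e.g. as a member.

    #include <iostream>
    #include <string>
    #include <variant>

    // Hand-written visitor: local state must be passed in explicitly,
    // since there are no captures to lean on.
    struct Visitor {
        int& counter;                                   // "manually closed over" state
        void operator()(int i) const                    { counter += i; }
        void operator()(const std::string& s) const     { std::cout << s << '\n'; }
    };

    int main() {
        int counter = 0;
        std::variant<int, std::string> v = 42;
        std::visit(Visitor{counter}, v);                // counter becomes 42
    }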


If you build your visitor out of lambdas instead of a struct, you can easily propagate local state to the visitor by using lambda captures. There are good examples of this approach at https://en.cppreference.com/w/cpp/utility/variant/visit
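
A minimal sketch of that pattern (essentially the "overloaded" helper from the cppreference page; the identifiers are mine):

    #include <iostream>
    #include <string>
    #include <variant>

    // The usual "overloaded" helper; the deduction guide is no longer needed in C++20.
    template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
    template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

    int main() {
        int counter = 0;                                // local state, captured below
        std::variant<int, std::string> v = 42;

        std::visit(overloaded{
            [&](int i)                { counter += i; },
            [&](const std::string& s) { std::cout << s << '\n'; },
        }, v);
    }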


Unfortunately, you can't leverage template substitution rules with the lambda approach. And that's really necessary if you want actually powerful match expressions.


For those working with C++, an answer might come in C++23.

http://open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1371r0....


If I'm reading this right, the core of the argument is that since any continuous function can be approximated via a Taylor series expansion, activation functions can be seen, in effect, as polynomial in nature, and since a neuron layer is a linear transformation followed by an activation function, the whole system is polynomial.

That's "technically" correct, but it feels like an academic cop-out. Interesting/useful transfer functions tend to be functions that take very large expansions to be approximated with any accuracy.


That is a serious practical problem, which shows up even in this paper: the authors were unable to fit a degree 3 polynomial to a subset (just 26 components) of the MNIST data (handwritten digits) due to a "memory issue".

But mathematical theory need not be practical. The relation between NNs and polynomial regression might be a fruitful theoretical observation even if the equivalent polynomial regression is incalculable.
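
As a rough sense of scale (my back-of-the-envelope numbers, not the paper's): a polynomial of degree at most d in n variables has

    \binom{n+d}{d} \text{ terms}, \qquad \binom{26+3}{3} = 3654, \qquad \binom{784+3}{3} \approx 8.1 \times 10^{7},

so even the reduced 26-component, degree-3 case means a dense design matrix of roughly 60,000 × 3,654 doubles (about 1.75 GB) over the standard MNIST training set, and the full 784-pixel, degree-3 case is far beyond that.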


They were unable to fit a degree three polynomial, but were still able to obtain 97% accuracy with a degree two polynomial.


I don't think it is even fruitful. We already know that mappings without poles can be approximated in various ways: Taylor expansion, piecewise linear, Fourier transforms. Taylor expansion corresponds to the authors' polynomial fitting; piecewise linear corresponds to an NN with ReLU activations.


Not to mention that as a practical matter, the ability to train a neural network with backpropagation is important to get results that actually converge in a reasonable amount of time. It's not useful to say "but you could just use a polynomial regression" if you can't actually generate the equivalent polynomial regression in the same amount of time that you can train a neural network.


I'm not sure that's actually correct. In fact, I'm sure it is incorrect for a polynomial of degree 1. Beyond that, there's nothing special about ReLU or tanh: you could just as well use sequential/backprop training on polynomial regression in general.


Not to be nitpicky, but it is not the Taylor polynomial (Taylor polynomials do not approximate arbitrary continuous functions). The relevant result is Weierstrass's theorem on polynomial approximation on a closed interval.

For example, f(x) = exp(-1/x^2) (with f(0) = 0) has the same Taylor expansion at x = 0 as g(x) = 0, yet it is not identically zero.
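
For completeness, the Weierstrass approximation theorem is the statement being used here:

    \forall f \in C[a, b],\ \forall \varepsilon > 0,\ \exists \text{ a polynomial } p \ \text{such that}\ \sup_{x \in [a, b]} |f(x) - p(x)| < \varepsilon.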


Wouldn't the math academics have seen this in literally one second? How was this not asserted even earlier? Just genuinely curious as a completely unaware programmer.


My gut feeling is that yes, this is pretty much self-evident.

However, the interesting part of the paper is that they use that equivalence to propose that properties of polynomial regression are applicable to neural networks, and they draw some conclusions from that.


> any continuous function can be approximated via a taylor series expansion

We can get a good polynomial approximation of any continuous function, but only on a bounded set. Wouldn't such an assumption (restricting the domain of the activation function) be problematic?


I think that's a very good point. Yes, you can approximate any given classical NN with a polynomial, but how does the number of terms in the polynomial scale with the network size and the desired accuracy? There might be a very good paper there.


C++, as a language, has never cared about the notion of "files". The entire standard is defined in terms of a "translation unit", which is an abstract notion that we tend to associate with "a single .cpp file" by nothing but convention.

Since modules operate at the language level, they need to operate on this notion, which precludes importing by file.
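
A minimal sketch of what that looks like in C++20 (module and file names are mine; how module units map to files is left to the implementation and build system, e.g. .cppm for Clang or .ixx for MSVC):

    // math.cppm -- a module interface unit; the file name is a build-system convention
    export module math;
    export int square(int x) { return x * x; }

    // main.cpp
    import math;                     // names the module, not a file path
    int main() { return square(3); }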


But a header is a file, no? And it is referenced explicitly by its file name.


No, the language of the standard carefully avoids talking about files. This is because there are still ancient mainframe operating systems around that do not have typical hierarchical filesystems, and it is still technically possible to provide a C++ implementation for them. Prescribing a module-name-to-file-name mapping would not work in these environments either. This is also why #pragma once was rejected and the replacement #once [unique ID] was invented instead: just defining what is and isn't "the same file" for #pragma once turned out to be too difficult.


What I don't get, though, is why these ancient mainframes need the latest version of the standard. I can't imagine the compiler writers for these OSs being too eager to implement any change at all. You said "technically possible"; are you implying that nobody actually does? What are these OSs?

To me this seems like a weird take on accessibility. In order to accommodate that one OS that has some serious disabilities, everyone else has to suffer the consequences. Why not build a ramp for that one OS, and build stairs for everyone else?


IBM has multiple people on the standards committee, and they care a lot about both backward compatibility and new standards. They alone were strongly opposed to removing trigraphs from the standard.

Still, trigraphs were removed in the end; if there is enough support, the committee is willing to break backward compatibility.


>just defining what is and isn't "the same file" for #pragma once turned out to be too difficult.

Admittedly, that's not just a problem with old mainframes. Any system supporting file aliases (hardlinks, symlinks, or the same FS mounted at several locations, for instance) would be tricky to handle.

I always thought #pragma once was a bad idea for that reason: header guards with unique IDs don't require any compiler magic and are simple to reason about, without having to read the standard or the compiler's docs to figure out how they operate.
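
A minimal sketch of the contrast (the guard name is hypothetical):

    // widget.h -- classic include guard: "sameness" is defined by a macro name
    // chosen by the author, not by any notion of file identity.
    #ifndef MYLIB_WIDGET_H
    #define MYLIB_WIDGET_H

    void widget_frob();

    #endif // MYLIB_WIDGET_H

    // The compiler-specific alternative relies on the implementation deciding
    // whether two paths (hardlinks, symlinks, duplicate mounts) are the same file:
    // #pragma once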


That's handled by the preprocessor. It's literally just an "insert the contents of that file here" copy-paste.


But the preprocessor is part of the C++ standard, no? I'm really not seeing why it's OK for the preprocessor to refer to files but not for the language.

Also, going to this level of trouble to support systems that don't have files seems... odd. Targets that don't have files, that I can totally understand. But compiler toolchains that don't have the notion of a file? That sounds obscure beyond obscure. I'm surprised such a system would be a compilation host instead of a cross-compile target.


We are talking about a language that goes so far as to make sure it functions on systems where a byte is not 8 bits, or where memory is not necessarily linearly addressed. People tend to forget how shockingly flexible standard-compliant C++ code actually is.


I get that, but there's also precedent for cutting ancient things loose. Both C and C++ have finally decided to specify that signed integers are two's complement: https://twitter.com/jfbastien/status/989242576598327296?lang... Also trigraphs are gone in C++17.


This C++ code actually compiles with clang++. Incredible!

    int main(int argc, char *argv<::>)
    <%
        if (argc not_eq 1 and argc not_eq 2) <% return 1; %>
        return 0;
    %>
https://en.wikipedia.org/wiki/Digraphs_and_trigraphs#C
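
For reference, after replacing the digraphs (<% %> for braces, <: :> for brackets) and the alternative tokens (not_eq, and), that is just:

    int main(int argc, char *argv[])
    {
        if (argc != 1 && argc != 2) { return 1; }
        return 0;
    }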


Digraphs are still a part of the language. I would be more surprised if a conformant piece of code did not compile with a conformant compiler.


Trigraphs are gone, but it took a while to win the IBM representatives over.


I don't think they were ever really won over; I think their concerns were heard and they begrudgingly acquiesced rather than vote down C++17.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n421...


Hot take: success is almost always accompanied by a healthy dose of survivor bias. For each of these success stories, there are a lot of good people who did the exact same thing, but it didn't pan out because of timing, location, or other similar factors.


constexpr variables must have a literal type, which requires a trivial destructor.
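
A minimal sketch of what that rules out (type names are mine; this reflects the pre-C++20 "trivial destructor" requirement described above):

    struct Trivial {
        int x;                      // implicitly trivial destructor: a literal type
    };

    struct NonTrivial {
        ~NonTrivial() {}            // user-provided destructor: not a literal type
    };

    constexpr Trivial ok{42};       // fine
    // constexpr NonTrivial bad{};  // error: not a literal type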


The most egregious thing for me is:

> in other words, the reader is expected to understand the semantics of an unknown function without consulting the declaration (or documentation).

The idea that the author considers making code as self-documenting as possible at the call site to be a conceptual mistake is just weird to me.


Speculation: GPU-based text rendering is typically done using two triangles per glyph. So what they are seeing is probably not the allocation of the underlying texture storage, but the transient vertex buffer that contains the information about which glyphs to render and where to render them.


Well, most GPU-based painting backends would try to recycle those VBOs to avoid reallocating them every frame. But it's hard to get all the heuristics right; cache invalidation is one of the two hard problems of computer science, after all…


Interesting, that sounds plausible and reasonable, except that the authors mention the minimum granularity of memory reporting is 128KB blocks. For this attack to work, the response to a keystroke has to reliably allocate more than 128KB every time. The vertex and index buffers for a pair of triangles are less than 100 bytes, right? So if only geometry were allocated, you'd only see measurable allocations every once in a while, not for every keystroke. What else might be getting allocated?
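
For a rough sense of scale (a hypothetical but typical vertex layout, not anything from the article):

    #include <cstdint>

    // One plausible per-glyph vertex: screen position, atlas UV, packed RGBA color.
    struct GlyphVertex {
        float         x, y;     // 8 bytes
        float         u, v;     // 8 bytes
        std::uint32_t rgba;     // 4 bytes
    };                          // 20 bytes per vertex on typical platforms

    // 4 vertices (80 bytes) + 6 16-bit indices (12 bytes) = 92 bytes per glyph quad.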

