Chabs's comments | Hacker News

The Something Awful forums are the obvious candidate here. It costs a flat $10 to create an account. This generally makes trolling, spamming, and scamming a losing proposition and broadly helps with the signal-to-noise ratio.


It traditionally leads to bigger executables and longer compile times. With modern compilers, both of these are non-issues for "most" reasonably-sized projects.


Welp, I was not expecting this to show up on Hacker News so early in the project's life cycle.

This is still fairly early work, so I apologize if the API is not quite as neat as it could be yet.

Still, author here; I'll gladly accept any comments/criticisms.


std::visit mostly allows you to do that already:

https://coliru.stacked-crooked.com/a/be5c44281eea8bc4

The only unfortunate missing piece of the puzzle is that there's no trivial way to create a closure out of this, so it requires a bit more manual work to propagate local state to the visitor.
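
Roughly what that looks like with a hand-written visitor (a sketch of the approach, not the linked snippet; the names are mine): any local state has to be threaded through the visitor by hand, e.g. as a member.

    #include <iostream>
    #include <string>
    #include <variant>

    // Hand-written visitor: local state must be passed in explicitly,
    // since there are no captures to lean on.
    struct Visitor {
        int& counter;                                   // "manually closed over" state
        void operator()(int i) const                    { counter += i; }
        void operator()(const std::string& s) const     { std::cout << s << '\n'; }
    };

    int main() {
        int counter = 0;
        std::variant<int, std::string> v = 42;
        std::visit(Visitor{counter}, v);                // counter becomes 42
    }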


If you build your visitor out of lambdas instead of a struct, you can easily propagate local state to the visitor by using lambda captures. There are good examples of this approach at https://en.cppreference.com/w/cpp/utility/variant/visit
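
A minimal sketch of that pattern (essentially the "overloaded" helper from the cppreference page; the identifiers are mine):

    #include <iostream>
    #include <string>
    #include <variant>

    // The usual "overloaded" helper; the deduction guide is no longer needed in C++20.
    template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
    template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

    int main() {
        int counter = 0;                                // local state, captured below
        std::variant<int, std::string> v = 42;

        std::visit(overloaded{
            [&](int i)                { counter += i; },
            [&](const std::string& s) { std::cout << s << '\n'; },
        }, v);
    }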


Unfortunately, you can't leverage template substitution rules with the lambda approach. And that's really necessary if you want actually powerful match expressions.


For those working with C++, an answer might come in C++23.

http://open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1371r0....


If I'm reading this right, the core of the argument is that since any continuous function can be approximated via a Taylor series expansion, activation functions can be seen, in effect, as polynomial in nature, and since a neuron layer is a linear transformation followed by an activation function, the whole system is polynomial.

That's "technically" correct, but it feels like an academic cop-out. Interesting/useful transfer functions tend to be functions that take very large expansions to be approximated with any accuracy.


That is a serious practical problem, which shows up even in this paper: the authors were unable to fit a degree 3 polynomial to a subset (just 26 components) of the MNIST data (handwritten digits) due to a "memory issue".

But mathematical theory need not be practical. The relation between NNs and polynomial regression might be a fruitful theoretical observation even if the equivalent polynomial regression is incalculable.
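
As a rough sense of scale (my back-of-the-envelope numbers, not the paper's): a polynomial of degree at most d in n variables has

    \binom{n+d}{d} \text{ terms}, \qquad \binom{26+3}{3} = 3654, \qquad \binom{784+3}{3} \approx 8.1 \times 10^{7},

so even the reduced 26-component, degree-3 case means a dense design matrix of roughly 60,000 × 3,654 doubles (about 1.75 GB) over the standard MNIST training set, and the full 784-pixel, degree-3 case is far beyond that.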


They were unable to fit a degree three polynomial, but were still able to obtain 97% accuracy with a degree two polynomial.


I don't think it is even fruitful. We already know that mappings without poles can be approximated in various ways: Taylor expansion, piecewise linear, Fourier transforms. Taylor expansion corresponds to the authors' polynomial fitting; piecewise linear corresponds to an NN with ReLU activations.


Not to mention that as a practical matter, the ability to train a neural network with backpropagation is important to get results that actually converge in a reasonable amount of time. It's not useful to say "but you could just use a polynomial regression" if you can't actually generate the equivalent polynomial regression in the same amount of time that you can train a neural network.


I'm not sure that's actually correct. In fact, I'm sure it is incorrect for a polynomial of degree 1. Beyond that, there's nothing special about ReLU or tanh: you could just as well use sequential/backprop training on polynomial regression in general.


Not to be nitpicky, but it is not the Taylor polynomial (Taylor polynomials do not approximate arbitrary continuous functions). The relevant result is Weierstrass's theorem on polynomial approximation on a closed interval.

For example, f(x) = exp(-1/x^2) (with f(0) = 0) has the same Taylor expansion at x = 0 as g(x) = 0, yet it is not identically zero.
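
For completeness, the Weierstrass approximation theorem is the statement being used here:

    \forall f \in C[a, b],\ \forall \varepsilon > 0,\ \exists \text{ a polynomial } p \ \text{such that}\ \sup_{x \in [a, b]} |f(x) - p(x)| < \varepsilon.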


Wouldn't the math academics have seen this in literally one second? How was this not asserted even earlier? Just genuinely curious as a completely unaware programmer.


My gut feeling is that yes, this is pretty much self-evident.

However, the interesting part of the paper is that they use that equivalence to propose that properties of polynomial regression are applicable to neural networks, and they draw some conclusions from that.


> any continuous function can be approximated via a taylor series expansion

We can get a good polynomial approximation of any continuous function, but only on a bounded set. Wouldn't such an assumption (restricting the domain of the activation function) be problematic?


I think that's a very good point. Yes, you can approximate any given classical NN with a polynomial, but how does the number of terms in the polynomial scale with the network size and the desired accuracy? There might be a very good paper there.


C++, as a language, has never cared about the notion of "files". The entire standard is defined in terms of a "translation unit", which is an abstract notion that we tend to associate with "a single .cpp file" by nothing but convention.

Since modules operate at the language level, they need to operate on this notion, which precludes importing by file.
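
A minimal sketch of what that looks like in C++20 (module and file names are mine; how module units map to files is left to the implementation and build system, e.g. .cppm for Clang or .ixx for MSVC):

    // math.cppm -- a module interface unit; the file name is a build-system convention
    export module math;
    export int square(int x) { return x * x; }

    // main.cpp
    import math;                     // names the module, not a file path
    int main() { return square(3); }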


But a header is a file, no? And it is referenced explicitly by its file name.


No, the language of the standard carefully avoids talking about files. This is because there are still ancient mainframe operating systems around that do not have typical hierarchical filesystems, and it is still technically possible to provide a C++ implementation for them. Prescribing a module-name-to-file-name mapping would not work in these environments either. This is also why #pragma once was rejected and the replacement #once [unique ID] was invented instead: just defining what is and isn't "the same file" for #pragma once turned out to be too difficult.


What I don't get, though, is why these ancient mainframes need the latest version of the standard. I can't imagine the compiler writers for these OSs being too eager to implement any change at all. You said "technically possible"; are you implying that nobody actually does? What are these OSs?

To me this seems like a weird take on accessibility. In order to accommodate that one OS that has some serious disabilities, everyone else has to suffer the consequences. Why not build a ramp for that one OS, and build stairs for everyone else?


IBM has multiple people on the standards committee, and they care a lot about both backward compatibility and new standards. They alone were strongly opposed to removing trigraphs from the standard.

Still, trigraphs were removed in the end; if there is enough support, the committee is willing to break backward compatibility.


>just defining what is and isn't "the same file" for #pragma once turned out to be too difficult.

Admittedly, that's not just a problem with old mainframes. Any system supporting file aliases (hardlinks, symlinks, or the same FS mounted at several locations, for instance) would be tricky to handle.

I always thought #pragma once was a bad idea for that reason: header guards with unique IDs don't require any compiler magic and are simple to reason about, without having to read the standard or the compiler's docs to figure out how they operate.
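
A minimal sketch of the contrast (the guard name is hypothetical):

    // widget.h -- classic include guard: "sameness" is defined by a macro name
    // chosen by the author, not by any notion of file identity.
    #ifndef MYLIB_WIDGET_H
    #define MYLIB_WIDGET_H

    void widget_frob();

    #endif // MYLIB_WIDGET_H

    // The compiler-specific alternative relies on the implementation deciding
    // whether two paths (hardlinks, symlinks, duplicate mounts) are the same file:
    // #pragma once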


That's handled by the preprocessor. It's literally just an "insert the contents of that file here" copy-paste.


But the preprocessor is part of the C++ standard, no? I'm really not seeing why it's OK for the preprocessor to refer to files but not for the language.

Also, going to this level of trouble to support systems that don't have files seems... odd. Targets that don't have files, that I can totally understand. But compiler toolchains that don't have the notion of a file? That sounds obscure beyond obscure. I'm surprised such a system would be a compilation host instead of a cross-compile target.


We are talking about a language that goes so far as to make sure it functions on systems where a byte is not 8 bits, or where memory is not necessarily linearly addressed. People tend to forget how shockingly flexible standard-compliant C++ code actually is.


I get that, but there's also precedent for cutting ancient things loose. Both C and C++ have finally decided to specify that signed integers are two's complement: https://twitter.com/jfbastien/status/989242576598327296?lang... Also trigraphs are gone in C++17.


This C++ code actually compiles with clang++. Incredible!

    int main(int argc, char *argv<::>)
    <%
        if (argc not_eq 1 and argc not_eq 2) <% return 1; %>
        return 0;
    %>
https://en.wikipedia.org/wiki/Digraphs_and_trigraphs#C
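
For reference, after replacing the digraphs (<% %> for braces, <: :> for brackets) and the alternative tokens (not_eq, and), that is just:

    int main(int argc, char *argv[])
    {
        if (argc != 1 && argc != 2) { return 1; }
        return 0;
    }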


Digraphs are still a part of the language. I would be more surprised if a conformant piece of code did not compile with a conformant compiler.


Trigraphs are gone, but it took a while to win the IBM representatives over.


I don't think they were ever really won over; I think their concerns were heard and they begrudgingly acquiesced rather than vote down C++17.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n421...


Hot take: success is almost always accompanied by a healthy dose of survivor bias. For each of these success stories, there are a lot of good people who did the exact same thing, but it didn't pan out because of timing, location, or other similar factors.


constexpr variables must have a literal type, which requires a trivial destructor.
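
A minimal sketch of what that rules out (type names are mine; this reflects the pre-C++20 "trivial destructor" requirement described above):

    struct Trivial {
        int x;                      // implicitly trivial destructor: a literal type
    };

    struct NonTrivial {
        ~NonTrivial() {}            // user-provided destructor: not a literal type
    };

    constexpr Trivial ok{42};       // fine
    // constexpr NonTrivial bad{};  // error: not a literal type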


The most egregious thing for me is:

> in other words, the reader is expected to understand the semantics of an unknown function without consulting the declaration (or documentation).

The idea that the author considers making code as self-documenting as possible at the call site to be a conceptual mistake is just weird to me.


Speculation: GPU-based text rendering is typically done using two triangles per glyph. So what they are seeing is probably not the allocation of the underlying texture storage, but the transient vertex buffer that contains the information about which glyphs to render and where to render them.


Well, most GPU-based painting backends would try to recycle those VBOs to avoid reallocating them every frame. But it's hard to get all the heuristics right; cache invalidation is one of the two hard problems of computer science, after all…


Interesting, that sounds plausible and reasonable, except that the authors mention the minimum granularity of memory reporting is 128KB blocks. For this attack to work, the response to a keystroke has to reliably allocate more than 128KB every time. The vertex and index buffers for a pair of triangles are less than 100 bytes, right? So if only geometry were allocated, you'd only see measurable allocations every once in a while, not for every keystroke. What else might be getting allocated?
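
For a rough sense of scale (a hypothetical but typical vertex layout, not anything from the article):

    #include <cstdint>

    // One plausible per-glyph vertex: screen position, atlas UV, packed RGBA color.
    struct GlyphVertex {
        float         x, y;     // 8 bytes
        float         u, v;     // 8 bytes
        std::uint32_t rgba;     // 4 bytes
    };                          // 20 bytes per vertex on typical platforms

    // 4 vertices (80 bytes) + 6 16-bit indices (12 bytes) = 92 bytes per glyph quad.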

