I am a little bitter that it is trained on stuff that I gave away for free and will be used by a billion dollar company to make more money. I contributed the majority of that code before it was even owned by Microsoft.
> We pre-train our model on selected public GitHub code and fine-tune it on our relatively small competitive programming dataset.
But since the code was 'selected' you don't know if your code was used. However, they seem to have used Python and C++, so my code is probably not part of it.
They opensourced alphafold for anyone to use commercially despite big financial incentive to keep it private and use in their new drug discovery lab. No idea how this works or differs from alphafold but imagine they'll do the same here if possible
The problem is not really that microsoft owns github, or that licenses allow corporations free use, but that the tech giants are so big and have so much power.