The code used to generate the graphs and the KS Area Between Curves computation in this post is available as part of the dython library.

Image by lloorraa from Pixabay

Here’s a classic scenario you have probably run into as a Data Scientist: you need to train a binary classifier, say a simple logistic regression, on a rather small and imbalanced dataset. A typical example is CTR (Click-Through Rate) prediction, that is, predicting whether a user will click on an ad or not. So again, you don’t have a lot of data, and it’s highly imbalanced, with a negative/positive ratio somewhere around 1000:1.

Your goal, then, is to…


Image taken from Jason’s Movie Blog

One of my favorite movies as a kid was Independence Day (here in Israel it was known as The Third Day, as it takes place on July 4th, and, well, most of the world doesn’t celebrate independence on that day). Briefly, it tells the story of an alien invasion of Earth, and how pretty much the entire human race unites in its do-or-die battle against the extraterrestrial colonialists. The fiery speech of President Bill Pullman still echoes in my head: “The 4th of July will no longer be known as an American holiday, but as the day when the world…


Image by Free-Photos from Pixabay

The field of Machine Learning and Artificial Intelligence is changing rapidly. Five years ago, classical Machine Learning was the hottest trend; now it’s just like an iPhone 6S: outdated. Deep Learning dominates the market these days, and if you come back to this post in 2025, there’s a good chance we’ll have moved way past this (self-note: I bet on Deep Reinforcement Learning).

Being a data scientist requires you to keep up with the latest innovations and discoveries, but there’s so much information coming in from so many directions that it’s easy to get lost in the stream. So what…


Photo by StartupStockPhotos from Pixabay

In case you missed it, there’s a pandemic out there, and it forces all of us to shut down all public events. As time goes by, we all begin to understand the impact of the lockdowns, social distancing and absence of gatherings. One of the things we realized, and by “we” I refer to the Algo group at Taboola, where I work, is the impact this has on those who are just beginning their career path or are about to shift it.

We used to host and attend many data science meetups and conferences, and noticed that many junior data…


Original image by Vadim_P from Pixabay

This blogpost is now available in Polish too; read it on BulldogJob.pl

About two years ago I published my very first data-science related blogpost. It was about Categorical Correlations, and I honestly thought no one would find it useful. It was just experimental, and for myself. 1.7K claps later, I’ve learned that I cannot determine what other people will find useful, and I’m quite happy I can assist others on the web the way others on the web assist me.

I was also quite new to Python and Github at that time, so I also experimented with writing the code to these…


The code that generates the ROC graphs in this post is available as part of the dython library, which can be found on my GitHub page. Examples seen in this post are also available as a notebook.

ROCking hard (original image by Nadine_Em from Pixabay)

Assessing the predictions of any machine-learning model is probably the most important task of a Data Scientist — perhaps even more than actually developing the model. After all, while building super complex algorithms is the coolest thing, not knowing how to estimate their output properly is not the coolest thing.

There are several algorithms and tools dedicated to allowing a clearer view of how…


Implementations of all algorithms discussed in this blogpost can be found on my GitHub page.

The Qrash Course Series:

  1. Part 1: Introduction to Reinforcement Learning and Q-Learning
  2. Part 2: Policy Gradients and Actor-Critic

The previous (and first) Qrash Course post took us from knowing pretty much nothing about Reinforcement Learning all the way to fully understanding one of the most fundamental algorithms of RL: Q-Learning, as well as its Deep Learning version, Deep Q-Network. Let’s continue our journey and introduce two more algorithms: Policy Gradients and Actor-Critic. …


This blog post was originally published on Taboola’s Engineering Blog.

Our core business at Taboola is to provide the surfers-of-the-web with personalized content recommendations wherever they might surf. We do so using state-of-the-art Deep Learning methods, which learn what to display to each user from our growing pool of articles and advertisements. But as we challenge ourselves to produce better models and better predictions, we also find ourselves constantly facing another issue: how do we not listen to our models? Or in other words: how do we explore better?

As I’ve just mentioned, our pool of articles…


This blog post was originally published on Taboola’s Engineering Blog.

If you happen to write code for a living, there’s a pretty good chance you’ve found yourself explaining to yet another interviewer how to reverse a linked list or how to tell if a string contains only digits. Usually, the need for this B.Sc. material ends once a contract is signed, as most of these low-level questions are dealt with for us under the hood of modern coding languages and external libraries.

Still, not long ago we found ourselves facing one such question in real life: find an efficient algorithm for real-time weighted sampling
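The algorithm the post itself arrives at is not part of this excerpt. As a point of reference only, here is a minimal sketch of one common baseline for weighted sampling: precompute cumulative weights once, then binary-search them for each draw. The `WeightedSampler` class and its method names are hypothetical, chosen for illustration, and are not taken from the original post.

```python
import bisect
import itertools
import random


class WeightedSampler:
    """Draw items with probability proportional to their weights.

    Hypothetical illustration: a cumulative-sum table plus binary search.
    Building the table is O(n); each draw is O(log n).
    """

    def __init__(self, items, weights):
        if not items or len(items) != len(weights):
            raise ValueError("items and weights must be non-empty and of equal length")
        if any(w < 0 for w in weights):
            raise ValueError("weights must be non-negative")
        self.items = list(items)
        # cumulative[i] is the sum of weights[0..i]
        self.cumulative = list(itertools.accumulate(weights))
        self.total = self.cumulative[-1]
        if self.total <= 0:
            raise ValueError("at least one weight must be positive")

    def sample(self, rng=random):
        # Pick a point in [0, total) and find the first bucket that ends after it.
        point = rng.random() * self.total
        index = bisect.bisect_right(self.cumulative, point)
        # Guard against floating-point rounding pushing the point up to total.
        return self.items[min(index, len(self.items) - 1)]


sampler = WeightedSampler(["a", "b", "c"], [1, 2, 3])
draw = sampler.sample()  # "c" is drawn about half of the time
```

This baseline assumes the weights are fixed after construction; if the weights change between draws, the cumulative table has to be rebuilt, which is the kind of cost a real-time setting would want to avoid.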


The Tic-Tac-Toe game described in this post, as well as all algorithms and pre-trained models, can be found in the tic_tac_toe repository on my GitHub page.

When I’m asked to describe what fascinates me so much about Reinforcement Learning, I usually explain that I see it as if I train my computer in the same way I trained my dog: using nothing but rewards. My dog learned to sit, wait, come over, stand, lie down and pretend to be shot at (kudos to my wife), all in the exact same way: I rewarded her every time she…

Shaked Zychlinski

Research Team Lead at Lightricks. Previously Algorithm Engineer at Taboola & Data Engineer at Appsflyer. Lives in Tel Aviv, Israel. See me on shakedzy.xyz
