I'm a data scientist, and I've had a few seemingly simple "how would you build a schema" interview questions that I had a difficult time with on the spot.
The last one I remember was: how would you build a product table with coupons? OK, so two tables, right? No big deal. Well, we're going to need to keep a history, right? So now I need updates and datetimes for the different products and coupons. And now I should think about how to index the tables, and gosh, is my join to get the discounted price a good way to do that? Most coupons only allow a person to use them once; how the hell am I going to implement that?
They probably just wanted the simple product + coupon table, but they let me spin on it for quite a while like a madman.
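For what it's worth, here is a minimal sketch of where that spinning could land, assuming SQLite and a coupon that applies to a single product as a percentage. Every table, column, and function name below is made up for illustration; it is one possible layout, not what the interviewers had in mind. The price history is kept as one row per price period, and the "once per person" rule is left to a composite primary key rather than application logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- Price history: one row per price period instead of overwriting a column.
-- The primary key doubles as the index used by the "price as of" lookup.
CREATE TABLE product_price (
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    valid_from  TEXT    NOT NULL,  -- ISO-8601 timestamp, sorts as text
    price_cents INTEGER NOT NULL,
    PRIMARY KEY (product_id, valid_from)
);
-- Assumption: a coupon targets exactly one product, as a percentage.
CREATE TABLE coupon (
    coupon_id   INTEGER PRIMARY KEY,
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    percent_off INTEGER NOT NULL CHECK (percent_off BETWEEN 0 AND 100)
);
-- One row per use; the primary key is what enforces "once per person".
CREATE TABLE coupon_redemption (
    coupon_id   INTEGER NOT NULL REFERENCES coupon(coupon_id),
    user_id     INTEGER NOT NULL,
    redeemed_at TEXT    NOT NULL,
    PRIMARY KEY (coupon_id, user_id)
);
""")

# Hypothetical seed data: one product whose price changed mid-year.
conn.execute("INSERT INTO product VALUES (1, 'widget')")
conn.executemany(
    "INSERT INTO product_price VALUES (?, ?, ?)",
    [(1, "2024-01-01T00:00:00", 1000), (1, "2024-06-01T00:00:00", 1200)],
)
conn.execute("INSERT INTO coupon VALUES (10, 1, 25)")
conn.commit()


def discounted_price_cents(conn, coupon_id, as_of):
    """Price of the coupon's product at time `as_of`, with the discount applied."""
    row = conn.execute(
        """
        SELECT pp.price_cents * (100 - c.percent_off) / 100
        FROM coupon c
        JOIN product_price pp ON pp.product_id = c.product_id
        WHERE c.coupon_id = ? AND pp.valid_from <= ?
        ORDER BY pp.valid_from DESC
        LIMIT 1
        """,
        (coupon_id, as_of),
    ).fetchone()
    return row[0] if row else None


def redeem(conn, coupon_id, user_id, when):
    """Record a redemption; return False if this user already used the coupon."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO coupon_redemption VALUES (?, ?, ?)",
                (coupon_id, user_id, when),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling `redeem(conn, 10, 42, ...)` twice succeeds the first time and is rejected the second, because the database refuses the duplicate key. The trade-off is that this layout cannot express coupons that are usable N times or that span several products; those would need a different key or an extra table.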
I'd say this is exactly what the interviewers wanted. They're interested in how you break down the problem, the types of solutions you consider, and your understanding of the trade-offs involved. For example, I interviewed somebody who was adamant they could prevent double-booking by polling an endpoint and storing the state in Redux. Fantastic JavaScript skills, terrible knowledge of databases.
I don't like these questions. Data warehouse schema design is out of scope for data science; it's data engineering. Yes, we have to do a lot of ad-hoc data engineering along the way, but it's such a strange thing to interview for in lieu of the many possible data/math/stats skills and more directly relevant programming skills. It signals a lack of respect for division of labor and specialization, and that lack of respect will show up as a stretched, inefficient team.
To be clear, I think the ability to be your own data engineer is a great attribute as a data scientist. I just don't think it's reasonable to expect it: it's not part of the core job description.
I expect a skilled data scientist to be able to articulate the complexities of the real world in relational data. Otherwise, how the hell will they be able to infer the real world from relational data?