5.1 How Staff Evaluation Really Works

At L5 (Senior), the promotion committee asks: “Did they do the work?”
At L6 (Staff), the committee asks: “Did they change the outcome?”
The rubric on the company wiki is a lie. It lists things like “Technical Excellence” and “Mentorship.” But behind closed doors, the calibration committee is using a completely different set of heuristics.
If you write your self-review like a Senior DS, listing every ticket, dashboard, and model you shipped, you will fail. Here is how the game is actually played.
A Senior promo packet is measured by volume. It is a list of 50 links to dashboards, queries, analyses, and experiments.
A Staff promo packet is measured by density.
One paragraph in a Staff packet must encode a year of org-shaping work. You are not describing activity; you are describing a state change in the organization.
  • The Senior Bullet:
    • “Built a Churn Prediction Model (v2) using XGBoost with 85% AUC. Deployed to Airflow.” (Focus: output.)
  • The Staff Bullet:
    • “Identified that our definition of ‘Churn’ was a lagging indicator. Influenced the VP of Product to adopt ‘Activation’ as the new Q3 North Star. Rebuilt the measurement framework to support this pivot, saving the org 6 months of wasted optimization.” (Focus: trajectory change.)
The Unwritten Rule: If you delete the impact clause of your sentence and the work still sounds impressive (e.g., “I built a complex model”), you are writing a Senior packet. Staff work is only impressive because of the result, not the difficulty.
You think your manager writes your promo packet.
In reality, your packet is just a formality. The decision is made months before the packet is opened, in the shadow calibration.
Staff promotions require consensus across the leadership layer. Directors and VPs from other teams (Product, Engineering, Sales) must nod their heads when your name comes up.
  • The Litmus Test: When your name appears on the projector in the calibration room, does the VP of Product ask, “Who is that?” or say, “Oh, right, the person who saved the Pricing launch”?
  • The Unwritten Rule: You cannot introduce yourself in a promo packet. If the committee learns who you are during the review, it is too late. Your reputation must enter the room before you do.
This is the brutal mental model committees use to separate “Strong Seniors” from “Staff.”
They ask the counterfactual: “What would have happened if this person wasn’t there?”
  • If the answer is: “The analysis would have been two weeks late,” or “The code would be messier.” → You are a Senior. You are a capacity multiplier.
  • If the answer is: “We would have launched the wrong product,” or “We would still be optimizing for vanity metrics.” → You are Staff. You are a directional multiplier.
The Staff DS is not a “faster horse.” They are the GPS. If the GPS is missing, the car does not just go slower; it goes to the wrong destination.
Use this table to audit your own performance reviews.
| Dimension | Senior (L5) Evaluation | Staff (L6) Evaluation |
|---|---|---|
| The Currency | Velocity / Complexity / Accuracy: “Can they write complex code quickly & accurately?” | Economics & ROI (Revenue, OpEx, LTV): “Have they improved the ROI for the team?” |
| The Metric | Reliability: “Can I trust them to finish this task?” | Trajectory: “Did they change where the team is going?” |
| The Scope | Owned Systems (e.g., the pricing model). | Owned Outcomes (e.g., the profitability of the pricing line). |
| The Artifacts | Code, dashboards, documentation. | Strategy docs, protocols, decision records. |
| The “Win” | “I solved the hard problem.” | “I prevented the hard problem.” |
| The Feedback | “Great execution.” | “Great judgment.” |
A Senior packet proves you were busy. A Staff packet proves you were necessary. If you can be replaced by a hiring requisition, you are not Staff.

5.2 The Three Failure Modes

When a Senior Data Scientist fails, it is loud. The dashboard breaks, the model drifts, the deadline is missed.
When a Staff Data Scientist fails, it is silent. You can work 60-hour weeks, receive praise from your peers, and still be failing completely.
These are the three most common traps that kill careers at L6.
You spend your week in meetings, “aligning stakeholders,” “providing context,” and “unblocking the team.” Everyone likes you. You feel like a leader. But when performance review season arrives, your packet is empty.
  • The Symptom: You are everywhere, but your name is on nothing. You have “influence,” but no artifacts. You are the “glue,” but glue is invisible.
  • Why It’s Seductive: It feels like “strategy.” It feeds the ego to be the person everyone whispers to for advice. It mimics the behavior of a Director without the authority of a Director.
  • The Early Indicator: Look at your calendar for the last month. If you cannot point to a single strategy doc, architecture review, or protocol that exists because of you—and would persist without you—you are a Ghost. Influence must be solidified into text, or it did not happen.
You are the first person called when the metric spikes or the pipeline halts. You jump in, debug the issue, patch the SQL, and save the day. The team cheers. You feel indispensable.
  • The Symptom: You are solving the same class of problem for the third time. You are effectively a “Super Senior” DS—using your Staff-level speed to do L5 execution work.
  • Why It’s Seductive: It provides the high-dopamine hits that the Staff role usually denies you. It is low-ambiguity work with immediate gratitude.
  • The Early Indicator: You are too busy to write the “system fix.” You are so busy bailing water out of the boat that you have not had time to plug the hole.
    • The Unwritten Rule: If you fix a fire once, you are a hero. If you fix the same fire twice, you are a failure. A Staff DS fixes the system that caused the fire.
You know the legacy attribution model better than anyone. You know every quirk of the tracking pixels from 2019. You are the historian.
  • The Symptom: Your value is tied to a specific stack or domain, not your ability to solve new problems. When the company pivots to a new tech stack or business line, your stock plummets to zero.
  • Why It’s Seductive: It offers safety. No one can challenge you in your domain. You are the “local maximum” king.
  • The Early Indicator: You find yourself arguing against modernization or new tools, not because they are bad, but because they render your specific knowledge base obsolete. You are optimizing for your own relevance, not the company’s speed.
| Trap | The Self-Correction Trigger |
|---|---|
| The Ghost | “Write it down.” If you influenced a decision, write the decision record to claim the win. |
| The Firefighter | “Let it burn.” Delegate the fire to a Senior DS (as a growth opportunity) and spend that time writing the root-cause prevention doc. |
| The Overfit | “Kill your darlings.” Proactively document your legacy knowledge and hand it off, forcing yourself to learn the new stack. |