Ideas

Spreading Dataset 🙊

Debate contains a natural challenge dataset for transcription tasks. Events like LD and policy feature spreading (an abbreviation of “speed-reading”), where debaters attempt to speak as fast as possible while maintaining coherence. Create a web interface where debaters can interact with the existing Whisper API and “grade” their spreading. Get at least 100 hours of good spreading. Voila! A benchmark.

Progress Library ✅

A Python library that makes it easy to regularly save your work on a long job. If you’re processing 1M samples with some function, the pickler would dump your work into a pickle every X (say 10,000) runs.
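Here's a minimal sketch of what the core loop might look like, using only the standard library. The function name `process_with_checkpoints`, the `every` parameter, and the checkpoint file layout are all made up for illustration; a real library would probably want a nicer interface.

```python
import os
import pickle
import tempfile


def _dump(results, path):
    # Write to a temp file first so a crash mid-write can't corrupt the checkpoint.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(results, f)
    os.replace(tmp_path, path)


def process_with_checkpoints(samples, func, path, every=10_000):
    """Apply func to each sample, pickling the results list every `every` items.

    If a checkpoint already exists at `path`, the samples it covers are skipped.
    Assumes `samples` is iterated in the same order on every run.
    """
    results = []
    if os.path.exists(path):
        with open(path, "rb") as f:
            results = pickle.load(f)
    done = len(results)
    for i, sample in enumerate(samples):
        if i < done:
            continue  # already handled in a previous run
        results.append(func(sample))
        if len(results) % every == 0:
            _dump(results, path)
    _dump(results, path)  # final save
    return results


# Usage: square 25 numbers, checkpointing every 10 results.
with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint = os.path.join(tmp_dir, "progress.pkl")
    squares = process_with_checkpoints(range(25), lambda x: x * x, checkpoint, every=10)
```

Re-pickling the whole list each time is the simplest thing that works; for very large jobs, appending chunks to separate files would probably be cheaper.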

Club Politics 🗳

How do you start a college club? What is a good leadership structure? What are the key ingredients to long-term survival? Which kinds of disasters end up sinking the club? Someone do a sociological study of clubs on campus!

Implied Prestige ✨

A measure of the prestige of an employer: how often people leave other employers to come work there. You can measure this on a huge scale with a massive LinkedIn dataset. But as far as I know, such a dataset doesn’t exist yet (publicly). Anyone want to do some illegal scraping with me?

Enrollment Time is an Instrument ⏰

Berkeley enrollment is notorious. Students with late enrollment times—which are randomly assigned—often can’t get into the classes they want. The keyword randomly should make any economist smile. It provides a perfect instrument to test a variety of hypotheses related to high-demand classes. Does doing PE classes improve GPA? Does missing out on classes like DATA 100 leave students unprepared for later classes? Can we access and predict outcome variables like mental/physical health, life satisfaction, or career choice?

Interactive Linear Models 📈

Linear models shouldn’t be that hard to work with. Even with scikit-learn, building a basic OLS model can cause headaches; building and testing new features is even more frustrating, requiring a whole lot of re-running code. A new concept: input a Pandas dataframe and instantly start playing around with linear models in a dynamic interface. Use checkboxes to enable or disable features. Easily engineer new features with buttons for transformations, interactions, boolean conditions, etc. Select your outcome(s) of choice, whether it be train accuracy, test accuracy, cross-validated loss, or so on. Can it be easily misused to produce spurious models? Sure, but so can existing tools. It does nothing more than speed up the existing data science lifecycle.

P = ? 🧐

When I’m using BerkeleyTime (a student-run course catalog for Berkeley) to check course averages, I often find myself balking at the number of P’s and NP’s. For a class like CS 189, where fully 30% of all students opt to take the class P/NP, it’s quite hard to look at the left-skewed distribution and feel comfortable. Presumably, most of those P’s are actually Cs and Bs. I suspect that the letter grades underlying Ps are similarly distributed across classes, and that using the true distributions of Ps for select classes, we could reasonably estimate the real letter grade distributions of class grades. Beyond helping lazy assholes like me select easy classes, such a tool could help us understand grade inflation, rates of cheating, the effect of e-learning during COVID (which coincided with lax P/NP rules), and so on.

Speed Rubber Bands 🏎

When I’m driving and stuck behind a slow driver, I feel a kind of “distance debt” accumulating inside me. After passing the snail, I drive much faster than I normally would, in some sense making up for the lost distance. This should have implications for the design of speed bumps and slower zones. Do people respond by speeding up afterwards? Does that introduce externalities?

Strava for Gymheads 🏋️‍♂️

This is a space where there’s lots of room for product innovation. The current Strava-esque social network for weightlifting is BodySpace, which has a dreadful UI and what seems like a fairly inactive network. A smart fitness app that integrates with a bunch of other workout trackers (e.g. Strong) could gain a lot more traction.

Self-Grade Design ✍️

Two intro CS courses at Berkeley, EECS 16A and EECS 16B, both save time by having students grade their own homework. The readers then grade a subset of the problems and scale the rest of the grades by the discrepancy between “official” grades and self-grades. In theory, if students put good-faith effort into self-grades, this is a cost-effective and usually fair way of assigning grades. But I’m worried about (a) the extent and unfairness of noise and (b) the incentives to put an honest effort into self-grades. Under what conditions should students put in such an effort? And how unfair are these systems?
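To make the scaling step concrete, here's a rough sketch of one way it could work. I don't know the actual formula the courses use, so the multiplicative ratio below is an assumption, as are the function name `adjust_self_grades` and the 10-point problem scale.

```python
def adjust_self_grades(self_grades, reader_grades, max_points=10.0):
    """Scale un-audited self-grades by the reader/self ratio on audited problems.

    self_grades:   problem id -> score the student gave themselves
    reader_grades: problem id -> score a reader gave (only for audited problems)
    NOTE: the multiplicative rule is an assumed stand-in, not the real policy.
    """
    audited = [p for p in self_grades if p in reader_grades]
    self_total = sum(self_grades[p] for p in audited)
    reader_total = sum(reader_grades[p] for p in audited)
    # With nothing to compare against, leave the self-grades untouched.
    ratio = reader_total / self_total if self_total > 0 else 1.0

    adjusted = {}
    for problem, score in self_grades.items():
        if problem in reader_grades:
            adjusted[problem] = reader_grades[problem]  # official grade wins
        else:
            adjusted[problem] = min(max_points, score * ratio)
    return adjusted


# Usage: the student over-graded problem 1 by two points, so problems 3 and 4 shrink.
example = adjust_self_grades(
    {"1": 10.0, "2": 8.0, "3": 10.0, "4": 6.0},
    {"1": 8.0, "2": 8.0},
)
```

Even this toy version shows the noise problem: with only two audited problems, one honest mistake on problem 1 drags down every other problem on the assignment.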

Lazy Learners 😪

Every high school had them. Once they get into college, it’s all about maintaining the minimum GPA to not get rescinded. So many people learn only because they have to. That’s a problem, since few people disagree that intrinsic motivation is important for learning. To test this theory, we could proxy the role of extrinsic/intrinsic motivations to learn by comparing high school grades before and after getting into college.

No Heights on Zoom 💂

Tall people have an advantage in life, especially in terms of income. But research suggests that the underlying factor is nutrition: well-fed children are taller, smarter, and more charming. I suspect that some of the effect is driven by bias (and I’m not just salty). COVID presents a natural experiment: comparing job prospects for tall people before and after going on Zoom could reveal just how much being tall helps you in an interview or in the workplace.

Scientists and Surnames 👩🏽‍🔬

Adopting a partner’s surname is an important decision. Even more so for academics, who are often referred to as (Last Name, Year). Do female scientists and academics take a career hit after changing their surname? In particular, do newlyweds experience fewer citations in their new papers?

Correcting Round Investing 🟠

Strong evidence suggests that investors suffer from round number bias. I never hear an investor talking about “investing $1.69 million.” Instead, they talk about “buying 50 shares” or “selling at $0.” What if a hedge fund implemented a procedure where investors were forced to choose each digit independently: do you wish to buy 40 million? Or 41 million? 42? Would improving the granularity of judgements improve returns?

Police Stop 🚓

Every day of senior year, I drove a 30-minute trudge down to high school along Rockville Pike. Whenever there was a police car, I was sure to be late by 20 or more minutes. Whether it’s an accident or a traffic violation, the arrival of the police dramatizes the situation, clogs up the roadways, and generally pisses everyone off. What are the economic costs of a police stop? Do the benefits in terms of deterrence and assistance outweigh the harms?

Imperceptible Weights 🏋️

So much of lifting is mental. I bench 135. 155 is way too much. No fear! Hollow out the barbell of a bench press and make a set of insertable half-pound weights. Secretly, of course. Have trainers slowly add 1 pound to the same 5x5 workout every day, and voila, underconfident lifters can steadily make progress.

Unified Club App 1️⃣

Consulting clubs, machine learning teams, software development groups, business frats—tens of student orgs, each with their own application that says the same dumb thing in slightly different ways. Let’s strip down the app to the essentials (name, email, academics, resume) and let orgs pick out students to interview or invite to apply.

LinkedIn Experience Effects 👩‍💼

Ulrike Malmendier is known for her study of experience effects: the unique influence of personal experience on behavior above and beyond acquiring information. In a 2020 keynote, she laments not having more granular data on individual histories. Au contraire! LinkedIn provides fairly detailed employment histories of many CEOs, from GE to Netflix. If a CEO previously saw a company through declines, will they be more prudent in their current role?

drivers license 🏁

DMVs have loosened road test requirements to reduce COVID-19 risk. In Maryland, passing the test only requires you to park your car. Do lower standards for new drivers increase the risk of accidents? Does the adaptation still pass a benefit-cost analysis after considering these externalities? Are we really keeping teens like Olivia Rodrigo safe?

Chipotle-Starbucks Pairs 🌯

There always seems to be a Starbucks next to Chipotle. Am I just imagining it? Luckily, there’s free data on Chipotle locations and Starbucks locations. Using geolocations, we can evaluate how likely Starbucks and Chipotles are to be within a given radius. To establish a baseline, we can use a random sample of general fast food restaurants.
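A back-of-the-envelope version of the radius check could look like the sketch below. It assumes each location is a `(latitude, longitude)` pair in degrees; the function names and the coordinates in the usage example are invented, not real store locations.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def share_with_neighbor(anchors, candidates, radius_km):
    """Fraction of anchor locations with at least one candidate within radius_km."""
    if not anchors:
        return 0.0
    hits = sum(
        1
        for anchor in anchors
        if any(haversine_km(anchor, c) <= radius_km for c in candidates)
    )
    return hits / len(anchors)


# Usage with made-up coordinates: one "Chipotle" has a "Starbucks" ~110 m away.
fake_chipotles = [(0.0, 0.0), (10.0, 10.0)]
fake_starbucks = [(0.0, 0.001)]
paired_share = share_with_neighbor(fake_chipotles, fake_starbucks, radius_km=0.5)
```

Running the same function with the generic fast food sample as `anchors` gives the baseline share to compare against. The brute-force loop is quadratic, so a full national dataset would probably want a spatial index instead.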

Jogging Wind-downs 🏃

Kahneman’s famous studies on pain demonstrated that the peak and end of an experience dominate our retrospective evaluations. In the context of jogging, that might mean that two things are important: how painful was the steepest uphill, and how smooth was the wind-down? Perhaps people would have much brighter views of jogging if they ended each jog with a slow walk, basking in endorphins.

Debate Styles 🧥

We previously compiled a large database of wiki data, where each debater is mapped to their citations. If we treat each debater as a “document,” can we perform topic modeling on their citations and extract “types” of debaters? Would we see topics that resemble “philosophy,” “kritiks,” and so on?

Debate-Powered Research 🔌

Debate is a research powerhouse because of the competitive incentives to find the best arguments and the collaborative nature of teams. Collectively, the LD debate community probably puts in tens of thousands of hours of research on each topic.

What if we could harness that power for a pressing public policy question? Partner with a respectable policy institute like the Brookings Institution to host a free tournament with a large (maybe around $25,000) prize pool. There’s an application: turn in all of your research. Subject matter experts judge the applications and debates, pressuring debaters to find the best research. IPPF is probably the closest existing model.

Lane Switching 🚘

Investors trade too much and would often be better off holding on to their stocks. There are many explanations. Maybe they overreact to information, or excessively fear missing out. Do the same principles apply to switching lanes in heavy traffic? On a congested highway, I sometimes switch when I see a faster neighboring lane, only for that lane to slow down immediately after; I should just “hold” my current lane. What’s the optimal way to weave through rush hour traffic?

Sleepy Hackers 😴

Hackathons are notorious for pressuring participants to stay up, often for over 24 hours at a time. Caffeine cookies, candies, waters, and more flow freely from the stands. To a reader of Why We Sleep, the ritual is a health catastrophe. So do the data bear that out? How do participants feel one day after the event? Two days? How do measures of cognitive function change? How many don’t even make it home, getting into accidents along the way?

Sweat per Dollar 💦

Mental accounting refers to inconsistent valuations of money. You’re more likely to spend a bonus on jewelry than your monthly salary. But why? I can’t find a convincing unified theory. I suspect it has to do with the effort exerted to earn the money, or the sweat per dollar. Whenever I consider buying an $8.35 Chipotle bowl, I think: “is this worth holding 30 minutes of office hours?” Here’s how we could test it. Suppose there are 3 studies being conducted that pay the same amount to participants but vary in difficulty. Follow up with the participants in a week. Do the people who worked “harder” to get the money tend to spend it more prudently?

Admissions Noise Audit 🎧

In Noise, Kahneman et al. cite a study by Uri Simonsohn showing that cloudy weather induces college admissions officers to prefer and admit “nerdy” students. No surprise: the people in admissions panels are biased by arbitrary factors. But are they noisy? Could we conduct a noise audit on a college admissions office, maybe at UC Berkeley?

Kantian Nudges 👈

Nudges have a slight PR problem. Some libertarians and ethicists worry about the risk of paternalism in swaying people’s choices. Mark D. White’s argument is the closest I’ve seen to a well-formed philosophical argument. What would a more in-depth, Kantian analysis suggest? More liberal readings of Kant, such as those articulated by Ripstein, might have a different perspective than the libertarians.

Chinese Names 🇨🇳

Research has uncovered the impact of names on outcomes ranging from academic success to dating. But most of these studies were conducted on English names. Chinese, like some other languages, has tones, including rising and falling tones. And studies have shown that the inflection of speech (e.g. in Californian uptalk) can influence perceived respect. In China, does the tone of your name affect your chances of success?

Clubs for Cubs 🏘️

A comprehensive and user-friendly UI that builds on CalLink, providing additional information such as club activities, club size, application process (if any), websites, social media, points of contact, and (perhaps controversially) perceived ranking based on reviews. New Golden Bears should be able to easily find a community that suits them, without surprises.

Dry Routing ⛱️

Have you ever had to walk home in the rain? Or across campus to a class while it’s pouring? When I faced these situations in the past week, the real task was dry routing: finding a reasonably short path that minimized my exposure to rain. But I’m not familiar with lots of Berkeley, and I definitely don’t remember which streets have overhangs. Google and Apple Maps could be much more helpful if they could do dry routing for me, identifying which streets are most dry during a storm. Some approaches could be to use Google Street View data or to infer dry routes from pedestrian behavior (people probably walk faster in the rain, slower when dry).
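Once you have an estimate of how exposed each street segment is, the routing itself could be plain Dijkstra with a rain-weighted cost. The sketch below assumes a hypothetical graph where each edge carries a length in meters and an `exposed` fraction between 0 and 1; the `rain_penalty` knob and the cost formula are my own invention, not anything a maps app is known to use.

```python
import heapq


def dry_route(graph, start, goal, rain_penalty=3.0):
    """Cheapest path where exposed meters cost (1 + rain_penalty) times covered ones.

    graph: node -> list of (neighbor, length_m, exposed_fraction) tuples.
    Returns the list of nodes from start to goal, or None if goal is unreachable.
    """
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length_m, exposed in graph.get(node, []):
            new_cost = cost + length_m * (1.0 + rain_penalty * exposed)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1]


# Usage: a made-up map where the direct street is open sky and the detour has overhangs.
toy_map = {
    "home": [("class", 100.0, 1.0), ("arcade", 80.0, 0.0)],
    "arcade": [("class", 80.0, 0.0)],
}
rainy_path = dry_route(toy_map, "home", "class", rain_penalty=3.0)
sunny_path = dry_route(toy_map, "home", "class", rain_penalty=0.0)
```

Setting `rain_penalty` to zero recovers ordinary shortest-path routing, so the same knob could scale with how hard it's raining.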

Course Evaluations 📧

Every semester, I get at least 3 emails from the Berkeley administrators yelling at me to fill out my course evaluations. They look like this:

Peter, Your course evaluations will be closing very soon. See below to view the precise deadline for your evaluation(s). We have not yet received feedback for all of your courses. We appreciate that the end of the term is very busy and hope you can take a few minutes now to share your feedback.

We could reach out to the Learning Environment + Tools team at Berkeley Research, Teaching, and Learning with a project about raising course evaluation rates via behavioral economics-inspired tweaks to marketing emails. Imagine if the email read:

Hey Peter, You’re part of a small minority of students who haven’t turned in their course evaluations yet. We want to make your classes less stressful and more interesting, so help us help you. The form will close in 2 days, so turn in your feedback now.

Finishing Weak 🔚

Whenever I write a review, the last fourth is always the hardest. I’m almost there, and I already have a lot of good content. Why not slack a little on notes for the last chapter? Ego depletion suggests that willpower is a limited resource that decays over time. It’s been studied extensively in lab tasks, with mixed conclusions. But what about similar tasks in the real world with a defined “endpoint?” Do book summaries get more sparse as the chapters go on? Do people document code less as they wrap up a project? Can we quantify when and how much people “slack” as they reach the finish line?

Chess Dataset ♟️

It must be an underused dataset. Performance in chess games can tell us a host of things about persistence, general intelligence, and executive control. For example, how many lost games until someone gives up for the day? Is the hot hand a thing in chess? Did the general public get better or worse at chess after COVID-19 hit? What about after daylight savings?

Debater Careers 👔

What exactly do high school debaters do post-graduation? Does how they debated inform their career trajectories? Take one or more years’ worth of TOC entries. Segment them by debate type (PF/LD/Policy) and style (philosophy, policy, critical, tricks, flex). Use LinkedIn to identify their college major and internship/full-time industry. If the K debaters have it right, then LARPers/PFers should work more in finance/consulting etc. We can compile a table and make a Sankey diagram.

This is now reality!

Market (Over?)sizing 🧺

We know that investors are overconfident, and we know how much is lost due to excessive trading. What about market researchers, who are one step away? One type of study they do is market sizing: estimating, among other things, the total available market (TAM) for a given good or service. We might expect them to be too bullish, betting that a market is bigger than it really is. But validating such an estimate is harder, since many times the TAM is never realized. Here’s how we could empirically test for overconfidence: Take an industry like education. Segment it by different features: by grade level, online vs. hybrid, public vs. private, etc. Look for estimates of both overall TAM and TAM for each segment. Compare the overall estimate with the sum of the segment estimates. As Don Moore might suggest, people who zero in on their specific case and don’t take the outside view may end up being biased.
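The comparison boils down to one ratio. In the sketch below, the function name `segment_inflation` and every dollar figure are placeholders, and the check only makes sense if the segments are mutually exclusive and together cover the whole industry.

```python
def segment_inflation(overall_tam, segment_tams):
    """Ratio of summed segment TAM estimates to the overall TAM estimate.

    A value above 1.0 means the segment-level estimates add up to more than
    the whole, which would be consistent with segment-level overconfidence.
    Assumes the segments partition the industry (no overlap, no gaps).
    """
    if overall_tam <= 0:
        raise ValueError("overall_tam must be positive")
    return sum(segment_tams.values()) / overall_tam


# Usage with invented numbers (in billions of dollars), segmented by grade level.
ratio = segment_inflation(
    100.0,
    {"k12": 60.0, "higher_ed": 55.0, "corporate_training": 25.0},
)
```

Repeating this across several segmentations (grade level, delivery mode, public vs. private) would show whether the inflation is systematic or just noise in one set of reports.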

Bikeshare Surge Pricing 🚲

Ride share programs are well-known for using surge pricing to adjust demand and supply across time. Bike sharing apps like Lyft should do the same, except based on location instead of time. When I try to bike around campus, there’s an annoying imbalance of bikes. On the west of campus, there’s an abundance of bikes. On the east side, empty docks abound. The reason? The campus slopes down from east to west, so one way is a speedy, smooth glide, while the other is a pain in the ass. Lyft should price based on location and altitude so that there are bikes at Haas that I can use to fly home.

Bookmark Archive 🔖

Several people I know, myself included, make a folder on Chrome’s bookmarks bar called “Archive.” Whenever I stop using a set of bookmarks (say, for a short-term project), I stash it away in the archive folder. But this gets pretty annoying. You have to search an unordered dropdown and carefully hover over subfolders. There are currently ways to export and import bookmarks. Why not add a dedicated bookmark archive that lets people easily organize and find old bookmarks?

Urban Dictionary Names 📛

The Urban Dictionary name trend has taken over my Instagram. I noticed an interesting pattern: while most names have flattering definitions, some names have neutral/negative descriptions. Moreover, it seems that more popular names (like Lauren) have more positive definitions than less popular names (like Xaviere). I suspect that it’s because the page for a name like Lauren will get more visits from people named Lauren, who drive up upvotes for positive definitions. We could use this database of first names to find relationships between name popularity (and racial demographics) and the sentiment of the Urban Dictionary definition.

Debater Overconfidence 🤭

The Dunning-Kruger effect predicts that people with the least skill will be the most overconfident. Is this true of debaters? At debate tournaments, I’ll often hear both sides walk away thinking that they’ve “definitely picked up.” It happens especially often in mid- or lower-level debates. An experiment could survey debaters right after their rounds, asking them to estimate their chance of winning. We could also test the effect of a monetary incentive: if $5 is on the line, do they become more accurate? If they do, is there an implied “price of self-image”?
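Scoring the survey could be as simple as the sketch below. Since exactly one side wins each round, the average stated win chance across both debaters should come out to 50%; anything above that is collective overconfidence. The function name and the sample responses are hypothetical, and the Brier score is just one reasonable accuracy measure I'm swapping in.

```python
def calibration_summary(forecasts, outcomes):
    """Summarize how well stated win probabilities match actual results.

    forecasts: each debater's stated chance of winning, between 0 and 1
    outcomes:  1 if that debater won the round, 0 otherwise
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty forecasts and outcomes")
    n = len(forecasts)
    mean_forecast = sum(forecasts) / n
    win_rate = sum(outcomes) / n
    # Brier score: mean squared error of the forecasts (0 is perfect).
    brier = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / n
    return {
        "overconfidence": mean_forecast - win_rate,
        "brier": brier,
    }


# Usage: two made-up rounds where all four debaters thought they "picked up."
summary = calibration_summary([0.9, 0.8, 0.9, 0.8], [1, 0, 0, 1])
```

Computing the same summary separately for the incentivized and non-incentivized groups, or for top-bracket versus lower-bracket rounds, would give the comparisons described above.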

Zipcars and Speeding 🏎️

Zipcar operates on a reservation system, which means drivers have to book a time frame in advance. One night, I could only book 90 minutes for a trip to Oakland. What would’ve been a smooth trip became rushed after I had to stop to recharge my phone. I ended up heading back with only 20 minutes left. Although I knew better than to speed, I probably subtly went a little faster than I normally would. Are there more accidents and traffic violations towards the end of a reservation? Let’s get in contact with Zipcar and offer to do the analysis for them!

Chipotle’s Cup Fee 🥤

I fill my water cup with soda. Chipotle doesn’t like it. Yet, they recently decided to start charging 25¢ for water cups. But, as explored in the first chapter in Freakonomics, fees can often serve as moral license, crowding out intrinsic motivations. For those who didn’t steal soda on principle, the fee may empower them to abandon their moral righteousness. Since Chipotle supposedly studies soda-stealing, this is a question we could answer.

Scoots and Steps 🛵

I used to be an adventurous one. I walked 10,000+ steps every day to get around Berkeley. Ever since I got my scooter, my daily steps have plummeted to somewhere around 2,000-3,000 steps. The Mayo Clinic recommends 10,000 steps a day for healthy adults, so the substitution may actually have tangible health impacts. With nearly everyone keeping track of their own daily steps, could we gather data points from new scooter owners to measure the decrease in walking associated with buying a scooter?

GYST DeCal 💩

A class about getting your shit together. Just the gyst on: health, personal finance, productivity, socializing, professional development, studying, and mental wellness. Brief guest lectures from professors and professionals. Practical, take-home guides for implementing advice. Grades based on measured outcomes: Did you make your calendar? Did you actually track your personal finance? Have you meditated in the past 3 days?

Peter Zhang