Lies, Damned Lies, and Statistics

Anonymous · · Unknown Hometown · Joined unknown · Points: 0

Just don't try to use something like Mountain Project for your math, please. It isn't going to be anything better than a shot in the dark, because it wasn't designed to force people to be honest and humans' number one quality is lying.

SteveF · · Fort Collins, CO · Joined Aug 2007 · Points: 32

The site owners say scraping violates MP's terms of service (see "Scraping data from MP"), although for an application like this they probably wouldn't care.

Another possible approach might be to iterate through all object ID numbers with the stats php script, and identify which objects are routes. Then collect their ratings.
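A minimal sketch of that ID-walking idea, under stated assumptions: `fetch_object` is a hypothetical callable standing in for whatever request you would make to the stats script, and the `"type"` and `"rating"` keys are invented for illustration. Nothing here reflects MP's actual URLs or response format. The delay between requests is there so the walk stays gentle on the server.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def collect_route_ratings(
    object_ids: Iterable[int],
    fetch_object: Callable[[int], Optional[dict]],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, str]:
    """Walk a set of object IDs and keep the rating of every object that is a route.

    fetch_object is hypothetical: it should return None for an unknown ID,
    or a dict with a "type" key (and a "rating" key when the type is "route").
    """
    ratings: Dict[int, str] = {}
    for object_id in object_ids:
        record = fetch_object(object_id)
        if record is not None and record.get("type") == "route":
            ratings[object_id] = record["rating"]
        # Pause after every request so the crawl never hammers the server.
        sleep(delay_seconds)
    return ratings
```

Passing the fetcher and the sleep function in as arguments means the loop can be dry-run against a plain dict (for example `fake.get`) before it is ever pointed at a live site.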

J Q · · Unknown Hometown · Joined Mar 2012 · Points: 50
ViperScale wrote: Just remember whatever data you pull from a site like this is wrong and needs to be lowered by a few grades. It is human nature that most humans are a lot more likely to put that they did something on a site like this than to have really done it. I know plenty of people I have seen on here that I know IRL who have not climbed the stuff they say they have.
Take that, MP! Now Just Do It is a 5.14a; the snake in the grass has spoken. You now suck, and your project was unworthy.
teece303 · · Highlands Ranch, CO · Joined Dec 2012 · Points: 596

“Just don't try to use something like Mountain Project for your math, please. It isn't going to be anything better than a shot in the dark, because it wasn't designed to force people to be honest and humans' number one quality is lying.”

Um... This sentiment is really silly. It's expressed more than once in this thread.

What should one do? Use the Grade-o-meter 5000 to collect grade data on climbs?

Any human-reported data set will have its problems. So? It's better than the alternative data set: which is a set pulled from one's ass.

The problem of falsely reported grade info is real, but nowhere near as bad as some are making it out to be.

Do all the math you want with MP data. It's your best and only source if you want to quantify climbing grades.

Anonymous · · Unknown Hometown · Joined unknown · Points: 0

How about just climbing for fun and ignoring the rest of the crap? Is it that hard? I only look at grades to avoid getting in over my head; beyond that they don't mean anything to me.

michael s · · Denver, CO · Joined Apr 2012 · Points: 80
ViperScale wrote: Just don't try to use something like mountain project for your math please.
*shakes fist* "You kids and your math"
Reggie Pawle · · Boston, MA · Joined Nov 2010 · Points: 5
teece303 wrote:“Just don't try to use something like mountain project for your math please. It isn't going to be anything better than a shot in the dark because it wasn't designed to force people to be honest and human's number one quality is lying.” Um... This sentiment is really silly. It's expressed more than once in this thread. What should one do? Use the Grade-o-meter 5000 to collect grade data on climbs? Any human-reported data set will have its problems. So? It's better than the alternative data set: which is a set pulled from one's ass. The problem of falsely reported grade info is real, but nowhere near as bad as some are making it out to be. Do all the math you want with MP data. It's your best and only source if you want to quantify climbing grades.
this. this is the most boring discussion I've ever seen. of course the data is shit. all real world data is shit. the fun part is making it usable, figuring out how to account for biases. but it's all completely academic unless you actually have the data.

speaking of which: I'm not terribly worried about copyright infringement because I have no intent of sharing the data except on mountainproject. I'm not sure if this violates the copyright of the original contributors though. don't want to be a dick about that, but I don't think it's a problem. again, academic until the data is generated.

speaking of which: I actually am concerned about sending too many requests to mountainproject's servers. I really don't want to crash the site, dick move. iterating through all possible object IDs? oof. most route ids look like 105xxxxxx, so that's 1m requests. some route ids start with 106, so maybe 2m requests? if mountainproject gets 50k visits a day, that'd be a pretty big spike. I'd just feel bad. my rudimentary scraper solution would still be significant though. if anything I'd go the scraper route, then do 1000 random hits to see how many routes the scraper misses.
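The arithmetic and the spot check in this post can be sketched roughly as follows. The one-request-per-second rate is an assumption, not a figure from the thread, and `is_route` is a hypothetical callable (in practice, one request per sampled ID). The miss rate is computed only over sampled IDs that turn out to be routes, with a simple binomial standard error.

```python
import math
import random
from typing import Callable, Collection, Sequence, Tuple


def crawl_duration_days(n_requests: int, requests_per_second: float) -> float:
    """How many days a crawl takes at a fixed, throttled request rate."""
    return n_requests / requests_per_second / 86_400  # 86,400 seconds per day


def estimate_miss_rate(
    scraped_ids: Collection[int],
    id_range: Sequence[int],
    is_route: Callable[[int], bool],
    sample_size: int = 1000,
    seed: int = 0,
) -> Tuple[int, float, float]:
    """Sample random IDs and estimate the share of real routes the scraper missed.

    Returns (routes_in_sample, miss_rate, standard_error).
    is_route is hypothetical: it stands in for one live lookup per sampled ID.
    """
    rng = random.Random(seed)
    sample = rng.sample(id_range, sample_size)
    routes = [i for i in sample if is_route(i)]
    if not routes:
        # No routes in the sample: the miss rate cannot be estimated.
        return 0, float("nan"), float("nan")
    missed = sum(1 for i in routes if i not in scraped_ids)
    rate = missed / len(routes)
    std_err = math.sqrt(rate * (1 - rate) / len(routes))
    return len(routes), rate, std_err
```

At one request per second, the 2m-request walk would take `crawl_duration_days(2_000_000, 1.0)`, roughly 23 days. One caveat on the spot check: if only a small share of IDs are routes, 1000 random hits contain few routes, so the standard error on the miss rate will be wide.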

if anyone wants to, you know, offer any substantial help or ideas that'd be cool. or we could just keep talking about how bad the data probably is, that's cool too.
Michael Brady · · Wenatchee, WA · Joined Jul 2014 · Points: 1,316

Or we could climb and not worry about this "problem"

teece303 · · Highlands Ranch, CO · Joined Dec 2012 · Points: 596

I understand that doing stats on MP route data might not be your thing. I have a degree in applied math, though, so it might be *my* thing.

But no, playing with route data and climbing are not mutually exclusive. And no, it's not chasing grades. And no, it does not preclude one from having fun climbing. Lighten up. This thread isn't for you.

If it doesn't interest you, move along, for Pete's sake. Statistical analysis is fun for a small subset of the population. An even smaller population enjoys stats AND climbs. Why not combine the two?

Michael Brady · · Wenatchee, WA · Joined Jul 2014 · Points: 1,316
teece303 wrote:I understand that doing stats on MP route data might not be your thing. I have a degree in applied math, though, so it might be *my* thing. But no, playing with route data and climbing are not mutually exclusive. And no, it's not chasing grades. And no, it does not preclude one from having fun climbing. Lighten up. This thread isn't for you. If it doesn't interest you, move along, for Pete's sake. Statistical analysis is fun for a small subset of the population. An even smaller population enjoys stats AND climbs. Why not combine the two?
Well... what fun would that be? Poking fun at people dissecting whether it is 10a or 10b, arguing about where a highball starts, or debating whether soloing is responsible is MY thing. I apologize. I do have a lot of respect for your mathematical interests, I just feel they could be put to much better use.
EDub · · Unknown Hometown · Joined Dec 2011 · Points: 0

Twinkie is most definitely a 12a

joshf · · missoula, mt · Joined Oct 2007 · Points: 790

perhaps grades are subjective...? Also, statistics is just learning how to lie well, the only truth is in personal experience.

Mike Skaug · · Boulder, CO · Joined Feb 2012 · Points: 15

Hey Brian,

I've been using BeautifulSoup, a Python library, to get data from MP.

If you want, I could help you get the data, so you could do the analysis on the full data set, or more complete subsets. Let me know.

--Mike
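Mike's actual code is not shown in the thread. As a rough illustration of the kind of parsing involved, here is a sketch that uses Python's standard-library `html.parser` as a stand-in for BeautifulSoup, so it runs without extra installs. The markup it looks for, a `<span class="route-grade">` element, is invented for the example and is not MP's real page structure.

```python
from html.parser import HTMLParser
from typing import List


class GradeExtractor(HTMLParser):
    """Collect the text inside every <span class="route-grade"> element.

    The tag and class name are assumptions made up for this sketch.
    It also assumes grade spans do not contain nested spans.
    """

    def __init__(self) -> None:
        super().__init__()
        self._in_grade = False
        self.grades: List[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and ("class", "route-grade") in attrs:
            self._in_grade = True
            self.grades.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_grade = False

    def handle_data(self, data):
        if self._in_grade:
            self.grades[-1] += data


def extract_grades(html: str) -> List[str]:
    """Return the stripped text of every grade span found in the HTML."""
    parser = GradeExtractor()
    parser.feed(html)
    parser.close()
    return [grade.strip() for grade in parser.grades]
```

With BeautifulSoup itself, the same lookup would be along the lines of `soup.find_all("span", class_="route-grade")`, again assuming that hypothetical markup.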

Brian Adzima · · San Francisco · Joined Sep 2006 · Points: 560

Thanks, Mike and Reggie. Electronically parsing the site was something I was curious about but was not sure how to get started on (especially without breaking something). I definitely appreciate the suggestions, and will take a look... when I get some more time for this project. Unfortunately I have a day job that also involves generating questionable data and wondering if there is any meaning in it.

michael s · · Denver, CO · Joined Apr 2012 · Points: 80
Brian Adzima wrote: Unfortunately I have a day job that also involves generating questionable data and wondering if there is any meaning in it.
Good on ya for thinking about this!

Three quotes that a friend shared with me not too long ago:

"To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of"
-Sir Ronald Aylmer Fisher

"The plural of anecdote is not data."
-Roger Brinner

"The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data."
-John Tukey

And one by me along the same lines:
"If the data ain't legit, you can't say shit"