Route Ratings Changed to Consensus

Christian wrote:Even with cutting out the outlier 13b and weighting the original rating more when there are fewer than 5 votes, how do the ratings below work out to a "median" of 5.8? It seems like in these cases, the effect of the algorithm is not "overweight the original", but more like "take the original, period"? Perhaps a case where a mean would be better? Don't remember the bolt spacing, but from other comments, I'm not sure 5.8 leaders being super confident about getting on this route is necessarily a good thing.
Just to make sure everyone understands the algorithm: 5.8 was the original submission, and we count it extra times. So the median of:


is either 5.8 or 5.9 depending on how you round.

I have deleted the 5.13b, which appears to be an attempt to move the consensus up. Bottom line: I am sure you are right that there are MANY cases where, at a given point in time, the MEAN would be better than the MEDIAN, but out of 130,000+ routes, we can't manually pick and choose which routes get which system. This will fix itself as more people weigh in and move the consensus.
Thanks for the explanation, Nick.

Just out of habit, I always think in terms of weighted averages, so I was having trouble picturing how the extra weighting works with a median. Even the fact that the original rating counts as if 3 people gave it that rating was news to me. Could have been 2 or 4 or 5 for all I knew.

Christian wrote:It seems like in these cases, the effect of the algorithm is not "overweight the original", but more like "take the original, period"?
Yes and no. The effect of the algorithm is that the original grade sticks until at least 3 other people have weighed in. If the weighting is in effect only when there are fewer than five people suggesting grades, then the algorithm is really "original grade or unweighted median" except when there are four suggestions ... and then it depends on whether the median rounds up or down.
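To make the "original grade sticks" effect concrete, here is a rough sketch, not the site's actual code. It assumes the original counts as 3 votes regardless of how many people have voted, and that an even-count tie rounds to the lower grade. The grade scale and the `consensus` function are made up for illustration.

```python
from statistics import median_low

# Hypothetical ordinal scale; only the relative order matters here.
SCALE = ["5.7", "5.8", "5.9", "5.10a", "5.10b"]

def consensus(original, suggestions, weight=3):
    """Lower median of the suggestions plus `weight` copies of the original.

    Assumption: ties on an even count round down (median_low).
    """
    ranks = [SCALE.index(original)] * weight
    ranks += [SCALE.index(grade) for grade in suggestions]
    return SCALE[median_low(ranks)]

# Original submission is 5.8; every later voter says 5.9.
for n in range(1, 6):
    print(n, "other votes ->", consensus("5.8", ["5.9"] * n))
```

Under those assumptions the consensus stays at 5.8 through three dissenting votes and only moves to 5.9 on the fourth. If ties rounded up instead, it would move on the third.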

The additional rating suggestions eventually have an effect, but only once enough people have weighed in.
Nick, can you query the data to find out what the overall trend has been? I'm interested to see if the trend was up or down, and how substantially it went up/down over the last month.

I thought the 3x weighting for the original dropped out entirely after 5 people weighed in, but it appears it's in effect regardless of how many people weigh in, and if there's an even number of ratings, it always rounds towards the original (could also be "always rounds down", I suppose?)

For example, this route had 10 ratings, not including my 11- rating added last. It was showing 10+ as consensus before my rating, which implies Brigitte's rating was still counting for 3 and that with those 12 ratings and the median between 10+ and 11-, it was rounding to 10+.

After my 11-, there were 13 ratings (with Brigitte's counting for 3) and a clear median of 11-, and indeed the consensus changed to 11-.
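The counting above can be checked with a small sketch. The individual votes below are made up (the real ones aren't listed in this thread), chosen only so that the 12 weighted ratings split evenly between 10+ and 11-. The scale and the `weighted_ranks` helper are hypothetical too.

```python
from statistics import median_high, median_low

# Hypothetical ordinal scale; only the relative order matters here.
SCALE = ["5.10", "5.10+", "5.11-", "5.11"]

def weighted_ranks(original, others, weight=3):
    """Sorted ranks with the original rating counted `weight` times."""
    ranks = [SCALE.index(original)] * weight
    ranks += [SCALE.index(grade) for grade in others]
    return sorted(ranks)

# Made-up votes: 9 ratings besides the original 5.10+ (10 ratings total).
others = ["5.10", "5.10+", "5.10+",
          "5.11-", "5.11-", "5.11-", "5.11-", "5.11", "5.11"]

before = weighted_ranks("5.10+", others)            # 12 weighted ratings
after = weighted_ranks("5.10+", others + ["5.11-"])  # 13 weighted ratings

for label, ranks in (("before", before), ("after", after)):
    low, high = SCALE[median_low(ranks)], SCALE[median_high(ranks)]
    print(label, len(ranks), "ratings, median between", low, "and", high)
```

With 12 weighted ratings the two middle values straddle 10+ and 11-, so the displayed grade depends on the rounding rule. With 13 there is a single middle value and no rounding is involved.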

"How do you like them apples"

Tom Nyce wrote:My preference is that rating within an area should be as consistent as possible. Those climbs are all in one guidebook, and there shouldn't be 5.8's that are harder than some other 5.9's for instance. When visiting climbers come to the area, they tackle some climbs cautiously to get used to the local rock type and ratings, and then can count on the guidebook consistency after that. When you allow a "consensus" of climbers (often including tons of visiting climbers, not used to that particular style of climbing) to chime in on the ratings, some climbs get changed (to match climbs in totally different areas where the climber is from). But, not all of them get changed, and the "consistency" of the ratings in that area suffers. Due to this effect, I've found that the older guidebooks to an area are often more internally consistent than the newer ones. Of course the older books have generally stiffer ratings, but they don't seem to have the unpredictable scatter that the newer books have. I'm talking about trad climbing rather than sport, because the local rock types make such a difference in the style required, and that is mainly what I have experience with. Of course, I'm not opposed to fixing up some true "sandbags," or over-rated climbs, that are not rated consistently with the other climbs in that same area.
I agree. I don't mind getting my ass kicked in a new area. And I try to grade things in places only when I've climbed there some. Different places, different styles of climbing, rock, and the grades vary. Grades are BS anyway. I do think it's important not to drag one's idea of what a grade is to a different area. And it's equally as important not to assume just because I can't do this, it's super hard, or because I can it's so easy. Grades are about averages or perhaps means - not your exact experience. Basically, grade with humility. You very well could just be lacking the requisite technique or be really good at that technique. I've often found this true for me. I will admit that especially as a new climber I suffered from this 'anchoring bias' - but I'm not sure that was a bad thing. I hadn't climbed enough then to really understand what was what. With more experience I'm far freer to grade things free of peer pressure.
Nice reverse engineering, Christian!

With weighting in effect more than for just four opinions, that's incentive for more people to weigh in (and incentive to get there first).

As for always rounding down, apparently Nick is just a hardman! (jk)

nkane wrote:I'll be interested to see which way grades move with this change, both now and over time. I suspect the trajectories will be upward.
New to the sandbag game huh?
I don’t know if this is working as intended. I put up a 10c a bit ago. Someone climbed the adjacent route, got their ID on the route wrong and downgraded it to 5.9. Now the site calls it 9+! Some unsuspecting 5.9 climber has a surprise coming!! On the positive side maybe I’ll collect some booty!

If they definitely made a mistake, post a link to that route here and I'll delete the error.

I noticed the other day that Royal Arches has finally moved to 5.10

