
3rd parties scraping profiles from mountain project

Original Post
MP · · Unknown Hometown · Joined Sep 2013 · Points: 2

I was just thinking: what prevents someone from downloading all MP profiles from the site?

Mike Gibson · · Payson, AZ · Joined Jul 2006 · Points: 0

The zero financial benefit.

Paul Wilhelmsen · · sandy, ut · Joined Aug 2012 · Points: 231

Listen to "Breach" on whatever platform you use to listen to podcasts. The stuff that you think of as unimportant can get hackers into your bank accounts, your voter registry... literally anything if they are determined enough. A couple hundred million dead accounts on Yahoo literally may have subverted democracy in the US as we know it.

A Twist · · Ventura County, CA · Joined May 2017 · Points: 0
mpech wrote: I was just thinking: what prevents someone from downloading all MP profiles from the site?

Out of curiosity, what do you mean by "downloading"?  What sort of information would the individual be gathering?

MP · · Unknown Hometown · Joined Sep 2013 · Points: 2
Andrew Tegley wrote:

Out of curiosity, what do you mean by "downloading"?  What sort of information would the individual be gathering?

Andrew, in your case: name, location, profile picture, age, associated posts.

As others in the thread noted, it is a less rich dataset than, say, Facebook or LinkedIn, so the desire to scrape the data may be lower. It just seems weird that anyone who creates an MP account can access all other account profiles.

FrankPS · · Atascadero, CA · Joined Nov 2009 · Points: 276

I just learned the word, "doxing." I think it applies to this concern the OP raises.

Bill Kirby · · Keene New York · Joined Jul 2012 · Points: 480

I can see your point. My wife never posts pictures of vacations on Facebook until we return home. She worries someone could see you’re away and break into your house. Someone could create an account, check what people are ticking off, and if it’s outta town, drive by yo house.

I’m always amazed at how easy it is to find out people’s information online.

J Squared · · Unknown Hometown · Joined Nov 2017 · Points: 0

you know.. you don't have to be signed into, or even a member of MP... to view someone's complete "profile"  (and it's not like either of these things are even a major hurdle for any determined AI)
therefore the data is already indexed by Google and every other internet spider.
it's already out there.  welcome to the internet.
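
For context on the spider point: a well-behaved crawler usually consults a site's robots.txt before fetching a page, though that file is advisory and does not actually block anyone. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents are made up (not Mountain Project's actual file), and `may_fetch`, the bot name, and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, NOT Mountain Project's actual file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /user/
Allow: /forum/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Profile pages are disallowed in this made-up file; forum pages are allowed.
print(may_fetch(EXAMPLE_ROBOTS_TXT, "example-bot", "https://example.com/user/12345"))
print(may_fetch(EXAMPLE_ROBOTS_TXT, "example-bot", "https://example.com/forum/topic/1"))
```

A crawler that ignores the file can still fetch anything publicly reachable, which is the point being made in this post.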

Jack Quarless · · Unknown Hometown · Joined Feb 2011 · Points: 0

It's so surprising that anyone still uses their full real names. More surprising that some throw tantrums about others not using the names on their social security card. Not surprising at all that these two groups heavily overlap and are mostly older people.

John Barritt · · The 405 · Joined Oct 2016 · Points: 1,083

It's probably already been sold....... ;)

@Jack (if that's your real name) not using your real name makes it easy to troll but is irrelevant in the data scheme. They know who you are.... ;)

J Squared · · Unknown Hometown · Joined Nov 2017 · Points: 0
John Barritt wrote: It's probably already been sold....... ;)

@Jack (if that's your real name) not using your real name makes it easy to troll but is irrelevant in the data scheme. They know who you are.... ;)

this is true!


it's not just Facebook that develops "shadow profiles" on everyone they can find.

W Melon · · Unknown Hometown · Joined Apr 2018 · Points: 0

import pprint
import re

import requests
from bs4 import BeautifulSoup

pp = pprint.PrettyPrinter(indent=4)

# NOTE: the forum shortened the URLs in this post (scheme stripped, paths
# truncated with "…"). They are left as posted and must be restored to run.
links = re.findall(r' mountainproject.com/forum/t…\d{9}.*(?=")',
                   requests.get(' mountainproject.com/forum/l…').text)
userList = {}
for link in links:
    _html = requests.get(link)
    # Collect every user profile URL that appears in the thread page.
    users = re.findall(r' mountainproject.com/user/\d{8,11}.*(?=">)', _html.text)
    for user in users:
        if user not in userList:  # fetch each profile only once
            soupSmall = BeautifulSoup(requests.get(user).text, "html.parser")
            _raw = soupSmall.find("div", {"class": "col-xs-12 text-xs-center"}).text
            # Drop punctuation and newlines, then collapse runs of spaces.
            _raw = re.sub(r'[^A-Za-z0-9 ]+', '', _raw)
            raw = re.sub(r'\W+', ' ', _raw)
            userList[user] = raw
pp.pprint(userList)
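
The two `re.sub` calls near the end of the script are easy to misread, so here is a minimal sketch of just that cleanup step, run on a made-up string instead of a real profile. The function name `clean_profile_text` and the sample text are assumptions for illustration; nothing here fetches a page.

```python
import re


def clean_profile_text(raw: str) -> str:
    """Apply the same two substitutions the script uses on the profile header."""
    # Drop everything except ASCII letters, digits, and spaces. This also
    # removes newlines, so words separated only by a newline get fused.
    stripped = re.sub(r'[^A-Za-z0-9 ]+', '', raw)
    # Collapse each remaining run of non-word characters (spaces) to one space.
    return re.sub(r'\W+', ' ', stripped)


# Hypothetical header text, not taken from a real profile.
sample = "Jane Doe\n  · Boulder, CO ·\n  Joined Jan 2015"
print(clean_profile_text(sample))  # prints: Jane Doe Boulder CO Joined Jan 2015
```

One side effect worth knowing: since the first pattern deletes newlines outright, `"a\nb"` comes back as `"ab"`, so fields that sit on adjacent lines with no surrounding spaces will run together.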
 
