Journalists can use Google's advanced search operators to mine information from social networks more effectively than by searching the networks directly. Key operators include "site:" to restrict a search to a specific domain, "inurl:" to search for terms within URLs, and "intitle:" to search for terms within page titles. Combined with relevant search terms, these operators let journalists search profiles and pages on networks such as Bebo, Friendster, LiveJournal, MySpace and LinkedIn efficiently, and uncover new leads, case studies and expert sources for their reporting.
Journalists and the Social Web 1
1. Web2.0 and 3.0, Social networks and Journalists Mining information from social networking sites Journalists are increasingly turning to social networks to look for case studies, contacts and expert opinion. But searching social networks can be frustrating and time consuming.
2.
3. Google’s advanced operators Google allows various ‘advanced’ operators. These are typed directly into the Google Search field. Used correctly and with care, they can be far more effective than using the ‘advanced’ search page.
4. A crash course in advanced operators: That search will look for pages that include the terms ‘patient’, ‘help’ and MRSA - but only on the website of the UK’s Health Protection Agency (HPA).
5. The operator ends with a colon, with no space between the colon and the term that follows. These are the most important operators for what we are discussing today. site:www.hpa.org.uk - restricts the search to HPA pages only. inurl:privacy - restricts the search to pages that have the word ‘privacy’ in the url. intitle:semantic - restricts the search to pages with ‘semantic’ in the title of the page.
6. link:www.hpa.org.uk - restricts the search to pages that link to the HPA. filetype:pdf - restricts the search to pdf documents. allintitle:privacy research - returns pages that have both ‘privacy’ and ‘research’ in the title. info:www.hpa.org.uk - accesses Google information about that site, such as similar sites and sites that link to it. Advanced operators are extremely powerful and can, for example, be used to access information on website servers.
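The operator syntax above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original presentation: `build_query` is a hypothetical helper name, and the HPA search string is an assumed reconstruction of the search described earlier (the terms ‘patient’, ‘help’ and MRSA, restricted to the HPA site).

```python
def build_query(*terms, **operators):
    """Join operator:value pairs and plain terms into one search string.

    Each operator is written as name:value with no space after the colon.
    An operator may be given a single value or a list of values.
    """
    parts = []
    for name, value in operators.items():
        values = [value] if isinstance(value, str) else value
        for v in values:
            parts.append(f"{name}:{v}")
    parts.extend(terms)
    return " ".join(parts)


# Assumed reconstruction of the HPA example search.
query = build_query("patient", "help", "MRSA", site="www.hpa.org.uk")
print(query)
```

The resulting string is pasted into the Google search field as it is. Passing a list, as in `inurl=["memberid", "bebo"]`, repeats the operator once per value.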
7. The technique varies depending on the social network We’ll start by looking at bebo.com Bebo profiles usually have a url that looks like this: http://www.bebo.com/Profile.jsp?MemberId=xxxxxxx The url normally contains the terms: ‘profile’ and ‘memberid’ Using google’s advanced operators we can include these terms in our search strings to search only within bebo profiles.
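One way to find candidate ‘inurl:’ terms for any network is to split a sample profile url into its component words. The sketch below is only an illustration of that idea: `url_tokens` is a hypothetical helper name, and choosing which tokens are distinctive enough to search on still needs human judgement.

```python
import re
from urllib.parse import urlparse


def url_tokens(url):
    """Return the lowercase words found in the path and query of a url."""
    parsed = urlparse(url)
    text = f"{parsed.path} {parsed.query}".lower()
    # Split on anything that is not a letter or digit and drop empty pieces.
    return [t for t in re.split(r"[^a-z0-9]+", text) if t]


tokens = url_tokens("http://www.bebo.com/Profile.jsp?MemberId=xxxxxxx")
print(tokens)
```

For the Bebo example the tokens include ‘profile’ and ‘memberid’, the two terms identified above. The placeholder ‘xxxxxxx’ also appears, because the function cannot tell a member id from a fixed part of the url.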
8. This search string: site:bebo.com inurl:memberid inurl:bebo Returned around 34 million hits in October 2008
9. Imagine you are looking for people on bebo.com who work for Pfizer. Search the bebo.com site itself and you’ll get around 85 hits.
10. But search in Google using this string: site:.bebo.com inurl:memberid inurl:bebo pfizer And you get 1,940 hits.
11. And many of those include open profiles from people who work for Pfizer
12. Search for the term “tomb-stoning” in Bebo and you get 3 people.
13. But use this string in Google: site:.bebo.com inurl:memberid inurl:bebo “tomb-stoning” And you get 98 profiles of people who claim to ‘tomb stone’.
14. Friendster.com requires a login to search profiles and to search within the Friendster pages. But you can get around this barrier using search engine operators combined with other search terms....
15. For example, this search: inurl:profiles inurl:friendster Returns nearly 7 million hits.
16. Searching within those results for ‘Oslo’, Google initially returns only 2 results. When Google hides many ‘similar pages’, you need to ‘repeat the search with omitted results included’.
17. When we do that, google returns 2,260 profiles from people in Oslo or who mention Oslo
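The same search can also be expressed as a url. The sketch below is an illustration only: `google_url` is a hypothetical helper name, and the `filter=0` parameter is an assumption about how Google exposes ‘repeat the search with omitted results included’ in the url, so check it against the link Google itself shows before relying on it.

```python
from urllib.parse import urlencode


def google_url(query, include_omitted=False):
    """Build a Google search url for a query string."""
    params = {"q": query}
    if include_omitted:
        # Assumption: filter=0 switches off the 'similar pages' filtering.
        params["filter"] = "0"
    return "https://www.google.com/search?" + urlencode(params)


url = google_url("inurl:profiles inurl:friendster oslo", include_omitted=True)
print(url)
```

`urlencode` escapes the colons and spaces, so the operators survive intact when the url is opened in a browser.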
18. Livejournal You can search LiveJournal communities and members via the ‘explore’ page. For example, imagine you are writing a story about the hospital-acquired infection MRSA. You can search for ‘mrsa’ in the LiveJournal search field.
19. And we get 3 matches for communities interested in MRSA. And 18 matches for users.
20. LiveJournal ‘community’ pages normally have URLs structured like this: http://community.livejournal.com/zen_within/ And LiveJournal ‘user’ pages normally have URLs structured like this: http://username.livejournal.com/XXXXX Using the same tactics as before, we can use Google’s advanced operators to search livejournal’s pages more effectively.
21. In October, this search: inurl:livejournal site:livejournal.com returned more than 55 million hits. And this search: inurl:livejournal site:livejournal.com mrsa returned more than 2,480 hits.
22. You can go on refining your results using similar tactics. For example, this search: inurl:livejournal site:livejournal.com mrsa inurl:community returns 373 results for ‘community’ pages only.
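The step-by-step narrowing described above can be sketched as follows. `refine` is a hypothetical helper name used only for illustration; the three search strings it produces are the ones quoted in the LiveJournal examples.

```python
def refine(query, *extra_terms):
    """Return a narrower search string by appending terms or operators."""
    return " ".join([query, *extra_terms])


base = "inurl:livejournal site:livejournal.com"   # all LiveJournal pages
by_topic = refine(base, "mrsa")                    # pages mentioning MRSA
communities = refine(by_topic, "inurl:community")  # community pages only

for step in (base, by_topic, communities):
    print(step)
```

Each step keeps the previous string unchanged and adds one constraint, which makes it easy to back out a refinement that removes too many results.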
23. Including the UKs ‘Cynical Nurse’ community: Where there is a thread on MRSA
24. Myspace Myspace profiles usually have a url that looks like this: http://profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendid=xxxxx The url normally contains the terms: ‘fuseaction’ and ‘viewprofile’ Using these terms to explore Myspace content using Google: site:myspace.com inurl:fuseaction Returned 17 million hits in October 2008.
25. Use Google as an extra tool to search Myspace. For example, if you search for ‘MRSA’ under ‘people’ in Myspace, you get 49 profiles. But this search in Google: site:myspace.com inurl:viewprofile MRSA returns 2890 results.
26. Linkedin Linkedin is generally seen as the professional social network for business people. But it is very difficult to search or view any profiles unless you are a member
27. As a member, if I search for ‘Pfizer’ under ‘people’ I get only 20 hits
28. But here are some of the 290,000 hits I obtained using: site:linkedin.com pfizer in Google And many of those are Pfizer employees...
29. Here is the ‘public profile’ of Pfizer’s Associate Director of Global Regulatory Affairs. This gives her current position, previous experience and education. Other profiles give interests. This ‘public’ listing can be found in Google when you enter specific names. But this technique allows you to search using company names, job titles etc.
30.
31.
32. With links to pro-ana websites, Potential case studies and anecdotes And other leads and links
33. Using: inurl:livejournal site:livejournal.com inurl:community pro-ana We can explore LiveJournal’s community sites that are pro-ana or campaigning against pro-ana. Adding in other terms to narrow focus
34. By adding ‘London’ to the search string, we get 115 hits in LiveJournal community pages that mention London. Some of those are potential leads.
35. Be flexible with these tactics Try different strategies with different social networks Hone your results by adding additional search terms Use Google’s ‘search within results’ option to drill down further
36. Using these tactics In May this year I set myself the target of finding: personal information related to someone under 16 years of age; someone’s precise location; and personal information related to someone’s work. In 10 minutes I was able to find:
37. - the mobile number of a 15-year-old girl in South London; - the address of where a 17-year-old waitress works in Kent; and, - the e-mail address and salary of an Accenture employee. These kinds of privacy blunders litter sites such as Bebo.com, Myspace.com and Facebook, and the debate about how best to protect people from identity theft has intensified as social networking has exploded in popularity.
38. Related tactics proved so successful at reaching sensitive, personal information that journalism.co.uk wrote to the Press Complaints Commission. We are likely to do so again. See demonstration.