From Before the Cradle: mapping online debates on c-section and family planning
1. Even Before the Cradle
mapping online debates on
c-section and family planning
Tommaso Venturini
tommaso.venturini@sciencespo.fr
2. An example of a medium-size project in digital methods
1 expert partner:
2 research partners:
Resources (@médialab):
- Donato Ricci (designer - 3 months full time)
- Audrey Baneyx (developer - 3 weeks)
- Support from the médialab team
Rationale for the project
The WHO issues recommendations on health-related practices to medical
institutions and to the general public, and needs ways to monitor the spread
and efficacy of such recommendations
4. Actual research protocol
1. Briefing with the issue experts
2. Draft of possible mapping approaches (by students)
3. Choice of datasets and methods with the issue experts
4. Data extraction and cleaning
5. Data treatment
6. Exploration and definition of specific research questions
7. Sketch of data visualizations
8. Meeting between data experts and designers
9. Data refinement
10. Development of the visualizations
11. Interpretation of the visualizations with the issue experts
12. Development of the public atlas
5. What we have done so far
1. Briefing with the issue experts
2. Draft of possible mapping approaches (by students)
3. Choice of datasets and methods with the issue experts
4. Data extraction and cleaning
5. Data treatment
6. Exploration and definition of specific research questions
7. Sketch of data visualizations
8. Meeting between data experts and designers
9. Data refinement
10. Development of the visualizations
11. Interpretation of the visualizations with the issue experts
12. Development of the public atlas
6. 1. Briefing with the issue experts
• July 2012
• Meeting with 2 experts from the WHO
• Mario Merialdi (Coordinator, Reproductive Health and
Research, Family, Women's and Children's Health)
• Ana Pilar Betran Lazaga (Assistant coordinator…)
• 1 afternoon (at the médialab) presentation on digital methods
and communication design
11. 3. Choice of datasets and methods
• 1 day meeting in Geneva
• Presentation of the students’ draft-maps
• Identification of methodological problems
• Choice of the two case studies
• After 20 days, list of precise research questions
(by the experts)
• After 20 days, definition of the operationalization (with the
experts)
12. Two case-studies
Caesarean section (also C-section, Cesarean section) “a
surgical procedure in which one or more incisions are
made through a mother's abdomen (laparotomy) and
uterus (hysterotomy) to deliver one or more babies”
(Wikipedia 05/07/13)
Family planning
“the planning of when to have children and the use of birth
control and other techniques to implement such plans”
(Wikipedia 05/07/13)
14. 3. Choice of datasets and methods
C-section
1. Websites hyperlinks
- cartography of the topology of the hyperlink connections
- analysis of the penetration of the WHO messages
2. Websites texts
- analysis of the expressions used by the different types of websites
3. Online images
- analysis of the representations used by the different types of websites
4. Discussions in AuFeminin
- analysis of the agenda of the online discussion
Family Planning
1. Websites texts
- analysis of the expressions used by the different types of websites
2. Wikipedia
- analysis of the edit history of the pages on family planning
16. 4. Data extraction and cleaning
C-section
1. Websites hyperlinks
- Query de-personalized Google with various queries
- Harvest the first 100 results
- Manually clean the results (366 seed-URLs)
2. Websites texts
- Harvest the source-code of the 366 seed-URLs and extract the textual content
3. Online images
- Query de-personalized Google (.co.uk, .fr, .it, .com) with translated queries
- Harvest the first 100 results for each query and engine (about 4,000 images)
- Harvest the source-code of the URLs containing the images
- Automatically extract the textual content
4. Discussions in AuFeminin
- Select some of the forums
- Search all the discussions containing the translated queries
- Harvest the discussions of last year with at least 2 replies
- Automatically extract the textual content
Family Planning
1. Websites texts
- Query the de-personalized Google.com with various queries
- Harvest the first 100 results
- Manually clean the results (553 seed-URLs)
- Harvest the source-code of the 553 URLs and extract the textual content
2. Wikipedia
- Select the most relevant Wikipedia pages related to family planning
- Extract the complete edit history of each page via the Wikipedia API
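The Wikipedia step can be sketched with the public MediaWiki API. This is a minimal illustration, not the project's actual script: the English-language endpoint, the chosen revision fields, and the helper names revisions_url and parse_batch are assumptions, and the sample response is hand-written rather than real data.

```python
import json
from urllib.parse import urlencode

# Assumption: English Wikipedia; the slides do not say which language editions were used.
API_URL = "https://en.wikipedia.org/w/api.php"


def revisions_url(title, rvcontinue=None):
    """Build a MediaWiki API URL asking for one batch of a page's revisions."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment|size",
        "rvlimit": "max",
        "rvdir": "newer",  # oldest revision first
        "format": "json",
        "formatversion": "2",
    }
    if rvcontinue:
        # Token returned by the previous batch; tells the API where to resume.
        params["rvcontinue"] = rvcontinue
    return API_URL + "?" + urlencode(params)


def parse_batch(payload):
    """Return (revisions, next_rvcontinue) from one decoded JSON response."""
    pages = payload.get("query", {}).get("pages", [])
    revisions = [rev for page in pages for rev in page.get("revisions", [])]
    next_token = payload.get("continue", {}).get("rvcontinue")
    return revisions, next_token


# Hand-written sample (NOT real data), shaped like a formatversion=2 response.
sample = json.loads("""
{
  "continue": {"rvcontinue": "20050101000000|111", "continue": "||"},
  "query": {"pages": [{
    "pageid": 1,
    "title": "Family planning",
    "revisions": [
      {"revid": 10, "parentid": 0, "user": "ExampleUserA",
       "timestamp": "2004-01-01T00:00:00Z", "comment": "created", "size": 120},
      {"revid": 11, "parentid": 10, "user": "ExampleUserB",
       "timestamp": "2004-02-01T00:00:00Z", "comment": "expanded", "size": 340}
    ]
  }]}
}
""")

revisions, token = parse_batch(sample)
```

To collect a complete history, one would fetch revisions_url(title, token) repeatedly (for instance with urllib.request.urlopen) and stop once parse_batch returns no token.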
18. 4. Data extraction and cleaning
C-section
1. Websites hyperlinks
- Query the de-personalized Google.com with various queries
(C-section, Cs delivery, Surgical delivery, Abdominal delivery,
Cesarean delivery, Caesarean delivery, Operative delivery, Caesarean, Cesarean)
- Harvest the first 100 results of each query
- Manually clean the results (366 seed-URLs)
2. Websites texts
- Harvest the source-code of the 366 seed-URLs
- Extract textual content (through Url2Text)
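The text-extraction step above relies on Url2Text, whose internals are not described in the slides. As a rough stand-in, the sketch below strips markup from already-harvested HTML source using only Python's standard library; the class and function names are hypothetical and the behaviour is far simpler than a dedicated tool.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the content of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(source):
    """Return the visible text of an HTML document given as a string."""
    parser = TextExtractor()
    parser.feed(source)
    parser.close()
    return parser.text()


# Invented page source, for illustration only.
page = (
    "<html><head><title>Birth options</title>"
    "<style>p{color:red}</style></head>"
    "<body><p>Cesarean &amp; delivery</p>"
    "<script>var x = 1;</script></body></html>"
)
extracted = html_to_text(page)
```

Navigation menus, footers and other page furniture would still end up in the output, which is one reason manual cleaning remains part of the protocol.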
19. 5. Data treatment
C-section
1. Websites hyperlinks
- Crawl the 366 seed-URLs through Hyphe
(https://github.com/medialab/Hypertext-Corpus-Initiative/)
- Manually and automatically clean the neighbor URLs (614 URLs)
- Extract the hyperlink networks through Hyphe
2. Websites texts
- Categorize the 366 seed-URLs
Theme: Pros, cons and worries; Involvement of the father; Ethical issues
Type: Health care providers; Government; Medical and scientific groups; NGOs / non-profit; Feminist
groups / female associations / moms; Rights groups; Natural & holistic delivery groups; Media; Blogs &
Discussions; Institutions; Hospitals & Clinics; Products
- Extract the noun-phrases through Pattern (www.clips.ua.ac.be/pattern)
- Cluster and clean the noun-phrases (through Google Refine)
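The clustering of noun-phrases can be illustrated with a key-collision approach in the spirit of the "fingerprint" method offered by Google Refine. The sketch below is an approximation written for this purpose, not the tool's own code, and the sample phrases are invented.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(phrase):
    """Normalize a phrase so that trivial variants collide on the same key."""
    text = unicodedata.normalize("NFKD", phrase.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")  # drop accents
    text = re.sub(r"[^\w\s]", "", text)                    # drop punctuation
    tokens = sorted(set(text.split()))                     # order-insensitive
    return " ".join(tokens)


def cluster(phrases):
    """Group phrases sharing a fingerprint; keep groups with several variants."""
    groups = defaultdict(list)
    for phrase in phrases:
        groups[fingerprint(phrase)].append(phrase)
    return [members for members in groups.values() if len(set(members)) > 1]


# Invented noun-phrases, for illustration only.
phrases = [
    "Caesarean section",
    "caesarean  section",
    "section, caesarean",
    "vaginal birth",
    "Cesarean section",
]
clusters = cluster(phrases)
```

Key collision only merges variants that differ in case, accents, punctuation or word order. Spelling variants such as "Cesarean" and "Caesarean" stay apart and still need manual review or a fuzzier method.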
33. 6. Exploration
C-section
1. Websites hyperlinks
- Visual network analysis in Gephi (gephi.org)
- Egocenter heatmaps in Heatgraph
(tools.medialab.sciences-po.fr/heatgraph/)
2. Websites texts
- Language intake analysis in Sven (sven.densitydesign.org)
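To make the hyperlink exploration more concrete, the sketch below aggregates page-level links into a site-level directed network, counts how many distinct sites cite each site, and writes an edge list with Source, Target and Weight columns of the kind Gephi can import as CSV. It is a simplified stand-in for what Hyphe produces; the function names and the example URLs are hypothetical.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlparse


def site_of(url):
    """Reduce a URL to its host name, ignoring a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def site_edges(page_links):
    """Turn (source_url, target_url) pairs into weighted site-to-site edges."""
    weights = Counter()
    for source, target in page_links:
        s, t = site_of(source), site_of(target)
        if s and t and s != t:  # ignore links inside the same site
            weights[(s, t)] += 1
    return weights


def in_degree(weights):
    """Count, for each site, how many distinct sites link to it."""
    degree = Counter()
    for _, target in weights:
        degree[target] += 1
    return degree


def to_edge_csv(weights):
    """Serialize the edges as a CSV edge list (Source, Target, Weight)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (s, t), w in sorted(weights.items()):
        writer.writerow([s, t, w])
    return buffer.getvalue()


# Invented links between placeholder domains, for illustration only.
links = [
    ("http://www.blog.example/post1", "http://agency.example/report"),
    ("http://blog.example/post2", "http://agency.example/faq"),
    ("http://clinic.example/", "http://agency.example/report"),
    ("http://clinic.example/a", "http://clinic.example/b"),
]
edges = site_edges(links)
cited_by = in_degree(edges)
edge_csv = to_edge_csv(edges)
```

In this toy network the in-degree of a site gives a first, crude indication of how widely it is cited, which is the kind of question the analysis of the penetration of the WHO messages asks of the real corpus.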
34. SVEN
http://sven.densitydesign.org
A) Pros, cons and worries
B) Involvement of the father
C) Health care providers
D) Government
E) Ethical issues
F) Medical and scientific groups
G) NGOs / non-profit
H) Feminist groups / female associations / moms
I) Rights groups
J) Natural & holistic delivery groups
K) Media
L) Blogs & Discussions
M) Institutions
N) Hospitals & Clinics
O) Products
35. Other datasets and methods
C-section
1. Websites hyperlinks
- cartography of the topology of the hyperlink connections
- analysis of the penetration of the WHO messages
2. Websites texts
- analysis of the expressions used by the different types of websites
3. Online images
- analysis of the representations used by the different types of websites
4. Discussions in AuFeminin
- analysis of the agenda of the online discussion
Family Planning
1. Websites texts
- analysis of the expressions used by the different types of websites
2. Wikipedia
- analysis of the edit history of the pages on family planning
36. What remains to do
1. Briefing with the issue experts
2. Draft of possible mapping approaches (by students)
3. Choice of datasets and methods with the issue experts
4. Data extraction and cleaning
5. Data treatment
6. Exploration and definition of specific research questions
7. Sketch of data visualizations
8. Meeting between data experts and designers
9. Data refinement
10. Development of the visualizations
11. Interpretation of the visualizations with the issue experts
12. Development of the public atlas
37. What we have learned
1. Digital methods are not easier or quicker
2. More data always entails more noise
3. Results quality depends heavily on data cleaning
4. No a priori distinction exists between noise and information
5. An iterative approach is necessary
6. Exchanges with experts and expertise building are necessary
7. Digital methods are a form of field work
38. tommasoventurini.it
Venturini, T. (2012). Great expectations: méthodes quali-quantitative et
analyse des réseaux sociaux.
In J.-P. Fourmentraux (Ed.), L’Ère Post-Media. Humanités digitales et
Cultures numériques (Vol. 104, pp. 39–51). Paris: Hermann.
Venturini, T., & Latour, B. (2010). Le tissu social : trace numérique et
méthodes quali-quantitatives. Proceedings of Future En Seine 2009. Paris:
Editions Futur en Seine.
Latour, B., Jensen, P., Venturini, T., Grauwin, S., & Boullier, D. (2012). “The
Whole is Always Smaller Than Its Parts”: A Digital Test of Gabriel
Tarde’s Monads. British Journal of Sociology, 63(4), 591–615.