Documents
Documents include the following:
Written texts: diaries, letters, e-mails, SMS texts, internet pages, novels, newspapers, school reports,
government reports, medical records, parish registers, train timetables, shopping lists, financial records,
graffiti etc.
Other texts: paintings, drawings, photographs, maps and recorded or broadcast sounds and images
from film, TV, music, radio, home video etc.
Sociologists make use of the following types of documents:
Public documents: produced by organisations such as government departments, schools, welfare
agencies, businesses and charities. They include Ofsted reports, council meeting minutes, media output,
published company accounts, records of parliamentary debates and the reports of public inquiries
such as the Macpherson Inquiry into the death of the teenager Stephen Lawrence.
Personal documents: include items such as letters, diaries, photo albums and autobiographies.
These are first-person accounts of social events and personal experiences and often include the writer’s
feelings and attitudes.
Historical documents: are simply personal or public documents created in the past.
Practical issues and documents
Documents have several advantages for the researcher:
They may be the only available source of information, for example in studying the past.
They are a free or cheap source of large amounts of data, because someone else has already gathered
the information.
For the same reason, using existing documents saves the sociologist time.
However, documents also pose practical problems:
It is not always possible to gain access to them.
Individuals and organisations create documents for their own purposes, not the sociologist’s.
Therefore, they may not contain answers to the kinds of questions that the sociologist wishes to ask.
Theoretical issues and documents
The choice of whether and how to use documents depends on the methodological perspective of the
sociologist. Interpretivists often favour the use of documents. Positivists tend to reject them because they
regard them as unreliable and unrepresentative, although they do sometimes carry out content analysis on
documents as a way of producing quantitative data.
Validity
Interpretivists prefer documents because they believe documents give a valid picture of the actor’s
meanings. The qualitative data they contain can bring the researcher closer to the actor’s reality.
Example: The Polish Peasant in Europe and America, William Thomas and Florian Znaniecki’s (1919)
interactionist study of migration and social change, used a variety of documents. These included 764 letters
bought after placing an advertisement in a Polish newspaper in Chicago, several autobiographies, and public
documents such as newspaper articles and court and social work records. They used the documents to reveal
the meanings individuals gave to their experience of migration.
Also, because documents are not written with the sociologist in mind, they are more likely to be an authentic
statement of their author’s views, although for the same reason they may not address the researcher’s question.
However, documents may lack validity in some instances as a source of data. John Scott (1990) identifies
three reasons for this. Firstly, a document can only yield valid data if it is authentic, that is, if it is genuinely
what it claims to be. Researchers cannot always be certain that it has not been forged.
Secondly, there is an issue of credibility. Is what the document says believable? For example, politicians may
write diaries or autobiographies that gloss over their mistakes in order to present themselves in a better light. A
document may also lack credibility if it was written long after the events took place, since the writer may
have forgotten or misremembered key events.
Thirdly, while interpretivists value documents because they give us access to their authors’ meanings, there is
always a danger of the sociologist misinterpreting the data. If the document is in a foreign language,
meanings may be distorted in translation, and if it was written many years in the past,
words may have meant something different at the time.
Reliability
Positivists regard documents as unreliable sources of data. Unlike official statistics on a topic, which are
gathered in a standard format according to fixed criteria that allow us to compare them, documents
are not standardised in this way. E.g. every person’s diary is unique, so it is difficult to compare diaries or to
generalise from them. Accounts of war are an example: some writers would have been on the front line and
witnessed death, some may have been injured themselves, and each would have a personal experience and
opinion of the same events.
Representativeness
Documents may also be unrepresentative for other reasons. As Scott notes, some groups may not be
represented in documents. E.g. the illiterate and those with limited leisure time are unlikely to keep diaries. In
addition, the evidence in the documents that we have access to may not be typical of the evidence in other
documents that we don’t have access to. For example:
Not all documents survive: are the surviving ones typical of the ones that get destroyed or lost?
Not all documents are available. The 30-year rule prevents access to many official documents
for 30 years; if classified as official secrets, they may not be available at all. Private documents such as
diaries may never become available.
If we cannot be sure that the information we extract from the documents is representative, we cannot safely
generalise from it.
Content analysis
Content analysis is a method for dealing with the contents of documents, especially those produced by the
mass media. It has been used to analyse news broadcasts, advertisements, children’s reading schemes,
newspaper articles and so on. There are two main types of content analysis:
Formal content analysis
Thematic analysis
Formal content analysis
Although documents are normally regarded as a source of qualitative data, formal content analysis allows us to
produce quantitative data from them. Ros Gill (1988) describes how formal content analysis works as
follows:
Imagine we want to measure particular aspects of a media message, for example how many female characters
are portrayed as being in paid employment in women’s magazine stories.
First we select a representative sample of women’s magazine stories, for example all the stories
appearing in the five most popular magazines during the last six months.
Then we decide what categories we are going to use, e.g. full-time housewife.
Next, we study the stories and place the characters in them into the categories we have decided upon.
This is called coding.
We can quantify how women are categorised in the stories simply by counting up the number in each
category.
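The coding and counting steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category names other than "full-time housewife", the story titles, the character names and the codings are all invented for the example and are not taken from Gill's study.

```python
from collections import Counter

# Hypothetical coding scheme. Only "full-time housewife" comes from the notes;
# the other category names are assumptions made for this illustration.
CATEGORIES = ("paid employment", "full-time housewife", "other")

# Hypothetical sample: one entry per female character, already coded by the
# researcher after reading the story. All values are invented.
coded_characters = [
    {"story": "Story A", "character": "Anna", "category": "paid employment"},
    {"story": "Story A", "character": "Beth", "category": "full-time housewife"},
    {"story": "Story B", "character": "Carol", "category": "full-time housewife"},
    {"story": "Story C", "character": "Dina", "category": "paid employment"},
    {"story": "Story C", "character": "Eve", "category": "other"},
]


def count_categories(coded, categories=CATEGORIES):
    """Count how many coded cases fall into each category."""
    counts = Counter({category: 0 for category in categories})
    for case in coded:
        if case["category"] not in categories:
            raise ValueError(f"Unknown category: {case['category']!r}")
        counts[case["category"]] += 1
    return dict(counts)


def category_shares(counts):
    """Turn raw counts into proportions of all coded cases."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("No coded cases to summarise")
    return {category: n / total for category, n in counts.items()}


counts = count_categories(coded_characters)
shares = category_shares(counts)
print(counts)
print(shares)
```

The computer only does the counting. Deciding which category each character belongs to is still done by the researcher before the data is entered.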
Formal content analysis is attractive to positivists because they regard it as producing objective, representative,
quantitative data from which generalisations can be made. It is also a reliable method because it is easy for
others to repeat and check the findings. Repeating studies also allows us to identify trends over time, for
example to see if media images of a group have changed.
Formal content analysis has also proved attractive to feminists in analysing media representations of gender.
For example, Lesley Best (1993) analysed gender roles in children’s reading schemes, while Gaye
Tuchman (1978) analysed television’s portrayal of women. Both found that females were portrayed in a
limited range of stereotyped roles.
However, interpretivists criticise formal content analysis for its lack of validity. They argue that simply counting
up how many times something appears in a document tells us nothing about its meaning.
The method is also not as objective as positivists claim. For example, the processes of drawing up the
categories and deciding in which one to place each case are subjective processes involving value judgements by
the sociologist.
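One simple way to see how much coding depends on the researcher's judgement is to have two people code the same cases independently and compare the results. The sketch below is an illustration only: the two coders, their codings and the `percent_agreement` helper are hypothetical and are not part of any study mentioned in these notes.

```python
def percent_agreement(coder_a, coder_b):
    """Share of cases that two coders placed in the same category."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must code the same number of cases")
    if not coder_a:
        raise ValueError("No cases to compare")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Hypothetical codings of the same four characters by two researchers.
coder_a = ["paid employment", "full-time housewife", "other", "paid employment"]
coder_b = ["paid employment", "other", "other", "paid employment"]

print(percent_agreement(coder_a, coder_b))  # 0.75: they disagree on one case in four
```

A low level of agreement would suggest that the categories are being applied subjectively, which is the interpretivist criticism made above.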
Thematic analysis
This is a qualitative analysis of the content of media texts and has been used by interpretivists and feminists. It
usually involves selecting a small number of cases for an in-depth analysis. The aim is to reveal the underlying
meanings that have been ‘encoded’ in the documents, as a way of uncovering the author’s ideological bias.
E.g. working from a feminist perspective, Keith Soothill and Sylvia Walby (1991) made a thematic analysis of
the ways newspapers reported rape cases.
However, thematic analysis can be criticised on several grounds.
It does not attempt to obtain a representative sample, so its findings cannot be safely generalised to a
wider range of documents.
There is often a tendency to select evidence that supports the sociologist’s hypothesis rather than
seeking to falsify it, which Karl Popper argues is unscientific.
There is no proof that the meaning the sociologist gives to the document is the true one. E.g.
postmodernists would argue that there is no fixed or ‘correct’ meaning to a text and that the sociologist’s
reading of it is just one among many.
Content analysis, whether formal or thematic, has practical advantages: it is cheap, and it is easy to find sources
of material in the form of newspapers, television broadcasts etc. However, in both formal and thematic analysis,
coding or analysing the data can be very time-consuming.
Summary
Secondary sources are those created by someone other than the sociologist who is
using them. They include official statistics and documents.
They save the sociologist time and money.
They provide useful data that the sociologist may not be able to gather
themselves, for example about people who have died.
Positivists regard official statistics as representative and reliable.
Statistics, especially ‘soft’ ones, may lack validity.
Interpretivists see official statistics as social constructs.
Marxists and feminists regard them as ideological.
Sociologists use both personal and public documents, such as diaries, letters,
media output and government reports. Documents may give insight into the
meanings of those who created them.
They may not be authentic or representative.
Sociologists may apply formal or thematic content analysis to documents.