Made in Text Blog

7 Simple Techniques to Analyze Your Text for Better Writing

Analyzing texts is a vital skill for improving writing. By examining different texts, you can learn a lot about structure, style, and content. This knowledge is key to enhancing your own writing. Understanding how authors construct their works gives you tools to develop your style. It’s like uncovering a roadmap to effective writing. By studying various texts, you gain insights that can transform your writing, making it more compelling and polished.

Understanding Text Structure

Recognizing text structures is crucial in writing. Each structure, be it narrative, expository, or persuasive, serves a different purpose. Understanding these purposes can guide your writing approach. For example, a narrative structure focuses on storytelling, while an expository structure aims to inform and explain.

One way to grasp these structures is through examples. Consider using a paper writing service like EssayPro to see samples. These services often provide well-crafted examples that illustrate different structures effectively.

When analyzing texts, look for clues. Narratives often use descriptive language and personal anecdotes, expository texts present facts and explanations, and persuasive writings argue a point with supporting evidence. Identifying these elements helps in applying them to your writing.

Analyzing Writing Style

Analyzing an author’s writing style involves focusing on elements like tone, word choice, and sentence structure. The tone of a text can range from formal to conversational, serious to humorous. Pay attention to how the tone influences the reader’s engagement.

Word choice is another critical element. Notice whether the language is simple or complex, abstract or concrete. This can reveal a lot about the author’s intent and audience.

Finally, examine the sentence structure. Short, punchy sentences can create a fast-paced narrative, while longer, complex sentences might be used for detailed descriptions or arguments. Understanding these elements can help you develop a versatile and effective writing style.
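One rough way to see the difference between short, punchy sentences and longer ones is to count the words in each sentence of a passage. The sketch below is only an illustration: the function name `sentence_lengths` and the sample passage are made up for this example, and the sentence splitting is deliberately naive (it will stumble on abbreviations such as "Dr.").

```python
import re
from statistics import mean


def sentence_lengths(text: str) -> list[int]:
    """Return the word count of each sentence in `text`."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


# Hypothetical sample passage, not taken from any real text.
sample = (
    "He ran. The door slammed. "
    "Nobody followed him into the long, dark corridor that night."
)

lengths = sentence_lengths(sample)
print(lengths)        # [2, 3, 10]
print(mean(lengths))  # average words per sentence
```

A passage whose counts stay low tends to read as fast-paced, while a wide spread of counts suggests the author is varying rhythm on purpose.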

Theme and Content Analysis

Identifying themes and main ideas is key in text analysis. Themes are the underlying messages or central ideas of a text. To find them, look for recurring topics or concepts. Ask yourself what the author is trying to convey about these topics.

For complex ideas, break them down into smaller parts. Analyze each part separately and consider how they connect. Look for patterns or contrasts in the text. This helps in understanding the broader theme or message.

Summarizing each paragraph can also be helpful. It allows you to see how ideas develop and interact throughout the text, leading to a clearer understanding of the main theme.

Character and Plot Analysis (For Fiction)

In analyzing characters in fiction, focus on their development, motivations, and interactions. Look at how they evolve and respond to challenges. This understanding can enrich your character creation.

Plot analysis involves understanding the sequence of events and their impact on the narrative. Identify key plot points and how they drive the story forward. Notice the conflict and resolution patterns.

Applying these insights to your writing can enhance character depth and plot structure. Use character analysis to create believable, dynamic characters. For plot, borrow structural elements like rising action, climax, and resolution to craft compelling narratives. Understanding these elements in existing texts can significantly improve your storytelling skills.

Use of Literary Devices

Recognizing literary devices like metaphors, similes, and symbolism requires attention to detail. Metaphors and similes create vivid imagery by comparing things, often enhancing a reader’s understanding and experience. Symbolism, on the other hand, involves using objects or actions to represent deeper meanings or concepts.

These devices add depth and layers to writing, allowing readers to engage with the text on a more meaningful level. To incorporate them effectively in your own writing, practice identifying them in texts you read. Then, experiment with using them to add richness and complexity to your narratives or descriptions, enhancing the overall impact of your writing.

Comparing Texts

Comparing and contrasting texts is like using the best coursework writing service – it’s about finding quality insights from different sources. Start by choosing texts with similarities in theme or style, then identify their differences. Look at aspects like tone, structure, and literary devices. Note how each text approaches these elements uniquely.

This practice broadens your perspective, exposing you to diverse writing styles and ideas. By analyzing these differences and similarities, you can develop a more nuanced understanding of writing techniques, which can then be applied to enhance your own writing style and content.

Applying Analysis to Writing

Applying insights from text analysis to your writing can significantly improve your skills. Use the structures you’ve identified to organize your content effectively. If a certain tone resonates with you, try incorporating a similar style in your writing. Experiment with literary devices you’ve analyzed to add depth and interest to your work.

Remember, experimentation is key. Don’t be afraid to try different techniques and styles. This process helps you find your unique voice and enhances your writing versatility. Keep practicing and revisiting the texts you admire to continually refine and evolve your writing style.

Text analysis is an invaluable tool for writers, offering insights into various writing styles, structures, and techniques. By regularly analyzing texts, you can enhance your understanding of effective writing and apply these learnings to your work. Embrace this practice as part of your writing routine. It can sharpen your skills, broaden your perspectives, and ultimately lead to more refined and compelling writing. Keep exploring and learning from different texts to continually grow as a writer.

Analyzing a Text

Written Texts

When you analyze an essay or article, consider these questions:

  • What is the thesis or central idea of the text?
  • Who is the intended audience?
  • What questions does the author address?
  • How does the author structure the text?
  • What are the key parts of the text?
  • How do the key parts of the text interrelate?
  • How do the key parts of the text relate to the thesis?
  • What does the author do to generate interest in the argument?
  • How does the author convince the readers of their argument’s merit?
  • What evidence is provided in support of the thesis?
  • Is the evidence in the text convincing?
  • Has the author anticipated opposing views and countered them?
  • Is the author’s reasoning sound?

Visual Texts

When you analyze a piece of visual work, consider these questions:

  • What confuses, surprises, or interests you about the image?
  • In what medium is the visual?
  • Where is the visual from?
  • Who created the visual?
  • For what purpose was the visual created?
  • Identify any clues that suggest the visual’s intended audience.
  • How does this image appeal to that audience?
  • In the case of advertisements, what product is the visual selling?
  • In the case of advertisements, is the visual selling an additional message or idea?
  • If words are included in the visual, how do they contribute to the meaning?
  • Identify design elements – colors, shapes, perspective, and background – and speculate how they help to convey the visual’s meaning or purpose.

About Writing: A Guide Copyright © 2015 by Robin Jeffrey is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.

How to Write a Rhetorical Analysis | Key Concepts & Examples

Published on August 28, 2020 by Jack Caulfield. Revised on July 23, 2023.

A rhetorical analysis is a type of essay that looks at a text in terms of rhetoric. This means it is less concerned with what the author is saying than with how they say it: their goals, techniques, and appeals to the audience.

Rhetoric, the art of effective speaking and writing, is a subject that trains you to look at texts, arguments and speeches in terms of how they are designed to persuade the audience. This section introduces a few of the key concepts of this field.

Appeals: Logos, ethos, pathos

Appeals are how the author convinces their audience. Three central appeals are discussed in rhetoric, established by the philosopher Aristotle and sometimes called the rhetorical triangle: logos, ethos, and pathos.

Logos, or the logical appeal, refers to the use of reasoned argument to persuade. This is the dominant approach in academic writing, where arguments are built up using reasoning and evidence.

Ethos, or the ethical appeal, involves the author presenting themselves as an authority on their subject. For example, someone making a moral argument might highlight their own morally admirable behavior; someone speaking about a technical subject might present themselves as an expert by mentioning their qualifications.

Pathos, or the pathetic appeal, evokes the audience’s emotions. This might involve speaking in a passionate way, employing vivid imagery, or trying to provoke anger, sympathy, or any other emotional response in the audience.

These three appeals are all treated as integral parts of rhetoric, and a given author may combine all three of them to convince their audience.

Text and context

In rhetoric, a text is not necessarily a piece of writing (though it may be this). A text is whatever piece of communication you are analyzing. This could be, for example, a speech, an advertisement, or a satirical image.

In these cases, your analysis would focus on more than just language—you might look at visual or sonic elements of the text too.

The context is everything surrounding the text: Who is the author (or speaker, designer, etc.)? Who is their (intended or actual) audience? When and where was the text produced, and for what purpose?

Looking at the context can help to inform your rhetorical analysis. For example, Martin Luther King, Jr.’s “I Have a Dream” speech has universal power, but the context of the civil rights movement is an important part of understanding why.

Claims, supports, and warrants

A piece of rhetoric is always making some sort of argument, whether it’s a very clearly defined and logical one (e.g. in a philosophy essay) or one that the reader has to infer (e.g. in a satirical article). These arguments are built up with claims, supports, and warrants.

A claim is the fact or idea the author wants to convince the reader of. An argument might center on a single claim, or be built up out of many. Claims are usually explicitly stated, but they may also just be implied in some kinds of text.

The author uses supports to back up each claim they make. These might range from hard evidence to emotional appeals—anything that is used to convince the reader to accept a claim.

The warrant is the logic or assumption that connects a support with a claim. Outside of quite formal argumentation, the warrant is often unstated—the author assumes their audience will understand the connection without it. But that doesn’t mean you can’t still explore the implicit warrant in these cases.

For example, look at the following statement:

We can see a claim and a support here, but the warrant is implicit. Here, the warrant is the assumption that more likeable candidates would have inspired greater turnout. We might be more or less convinced by the argument depending on whether we think this is a fair assumption.

Rhetorical analysis isn’t a matter of choosing concepts in advance and applying them to a text. Instead, it starts with looking at the text in detail and asking the appropriate questions about how it works:

  • What is the author’s purpose?
  • Do they focus closely on their key claims, or do they discuss various topics?
  • What tone do they take—angry or sympathetic? Personal or authoritative? Formal or informal?
  • Who seems to be the intended audience? Is this audience likely to be successfully reached and convinced?
  • What kinds of evidence are presented?

By asking these questions, you’ll discover the various rhetorical devices the text uses. Don’t feel that you have to cram in every rhetorical term you know—focus on those that are most important to the text.

The following sections show how to write the different parts of a rhetorical analysis.

Like all essays, a rhetorical analysis begins with an introduction. The introduction tells readers what text you’ll be discussing, provides relevant background information, and presents your thesis statement.

The example below shows how an introduction works.

Martin Luther King, Jr.’s “I Have a Dream” speech is widely regarded as one of the most important pieces of oratory in American history. Delivered in 1963 to thousands of civil rights activists outside the Lincoln Memorial in Washington, D.C., the speech has come to symbolize the spirit of the civil rights movement and even to function as a major part of the American national myth. This rhetorical analysis argues that King’s assumption of the prophetic voice, amplified by the historic size of his audience, creates a powerful sense of ethos that has retained its inspirational power over the years.

The body of your rhetorical analysis is where you’ll tackle the text directly. It’s often divided into three paragraphs, although it may be more in a longer essay.

Each paragraph should focus on a different element of the text, and they should all contribute to your overall argument for your thesis statement.

The example below shows how a typical body paragraph is constructed.

King’s speech is infused with prophetic language throughout. Even before the famous “dream” part of the speech, King’s language consistently strikes a prophetic tone. He refers to the Lincoln Memorial as a “hallowed spot” and speaks of rising “from the dark and desolate valley of segregation” to “make justice a reality for all of God’s children.” The assumption of this prophetic voice constitutes the text’s strongest ethical appeal; after linking himself with political figures like Lincoln and the Founding Fathers, King’s ethos adopts a distinctly religious tone, recalling Biblical prophets and preachers of change from across history. This adds significant force to his words; standing before an audience of hundreds of thousands, he states not just what the future should be, but what it will be: “The whirlwinds of revolt will continue to shake the foundations of our nation until the bright day of justice emerges.” This warning is almost apocalyptic in tone, though it concludes with the positive image of the “bright day of justice.” The power of King’s rhetoric thus stems not only from the pathos of his vision of a brighter future, but from the ethos of the prophetic voice he adopts in expressing this vision.

The conclusion of a rhetorical analysis wraps up the essay by restating the main argument and showing how it has been developed by your analysis. It may also try to link the text, and your analysis of it, with broader concerns.

Explore the example below to get a sense of the conclusion.

It is clear from this analysis that the effectiveness of King’s rhetoric stems less from the pathetic appeal of his utopian “dream” than it does from the ethos he carefully constructs to give force to his statements. By framing contemporary upheavals as part of a prophecy whose fulfillment will result in the better future he imagines, King ensures not only the effectiveness of his words in the moment but their continuing resonance today. Even if we have not yet achieved King’s dream, we cannot deny the role his words played in setting us on the path toward it.

The goal of a rhetorical analysis is to explain the effect a piece of writing or oratory has on its audience, how successful it is, and the devices and appeals it uses to achieve its goals.

Unlike a standard argumentative essay, it’s less about taking a position on the arguments presented, and more about exploring how they are constructed.

The term “text” in a rhetorical analysis essay refers to whatever object you’re analyzing. It’s frequently a piece of writing or a speech, but it doesn’t have to be. For example, you could also treat an advertisement or political cartoon as a text.

Logos appeals to the audience’s reason, building up logical arguments. Ethos appeals to the speaker’s status or authority, making the audience more likely to trust them. Pathos appeals to the emotions, trying to make the audience feel angry or sympathetic, for example.

Collectively, these three appeals are sometimes called the rhetorical triangle. They are central to rhetorical analysis, though a piece of rhetoric might not necessarily use all of them.

In rhetorical analysis, a claim is something the author wants the audience to believe. A support is the evidence or appeal they use to convince the reader to believe the claim. A warrant is the (often implicit) assumption that links the support with the claim.

Follow the assignment closely! A textual analysis, like any other writing, has to have a specific audience and purpose, and you must carefully write it to serve that audience and fulfill that specific purpose.

  • In any analysis, the first sentence or the topic sentence mentions the title, author and main point of the article, and is written in grammatically correct English.
  • An analysis is written in your own words and takes the text apart bit by bit. It usually includes very few quotes but many references to the original text. It analyzes the text somewhat like a forensics lab analyzes evidence for clues: carefully, meticulously and in fine detail.
  • In this particular type of reading analysis, you are not looking at all of the main ideas in a text, or the structure of the text. Instead, you are given a question that has you explore just one or two main ideas in the text and you have to explain in detail what the text says about the assigned idea(s), focusing only on the content of the text. Do not include your own response to the text.
  • An analysis is very specific, and should not include vague, poofy generalities.
  • The most common serious errors in this type of text analysis are including irrelevant ideas from the text, inserting your own opinions, or omitting key relevant information from the text.
  • Any analysis is very closely focused on the text being analyzed, and is not the place to introduce your own original lines of thought, opinions, discussion or reaction on the ideas in question.
  • When you quote anything from the original text, even an unusual word or a catchy phrase, you need to put whatever you quote in quotation marks (“ ”). A good rule of thumb is that if the word or phrase you quote is not part of your own ordinary vocabulary (or the ordinary vocabulary of your intended audience), use quotation marks. Quotes should be rare.
  • An analysis should end appropriately with a sense of closure (and not just stop because you run out of things to write!) and should finish up with a renewed emphasis on the ideas in question. However, DO NOT repeat what you wrote at the beginning of the analysis.
  • It is not possible to analyze a text without reading the text through carefully first and understanding it.

In an effective reading analysis paper:

Surface errors are few and do not distract the reader. 

OW ENGL 0310 rev 2/06

Textual Analysis | Guide, 3 Approaches & Examples

Published on 7 May 2022 by Jack Caulfield.

Textual analysis is a broad term for various research methods used to describe, interpret and understand texts. All kinds of information can be gleaned from a text – from its literal meaning to the subtext, symbolism, assumptions, and values it reveals.

The methods used to conduct textual analysis depend on the field and the aims of the research. It often aims to connect the text to a broader social, political, cultural, or artistic context.

The term ‘text’ is broader than it seems. A text can be a piece of writing, such as a book, an email message, or a transcribed conversation. But in this context, a text can also be any object whose meaning and significance you want to interpret in depth: a film, an image, an artifact, even a place.

The methods you use to analyse a text will vary according to the type of object and the purpose of your analysis:

  • Analysis of a short story might focus on the imagery, narrative perspective, and structure of the text.
  • To analyse a film, not only the dialogue but also the cinematography and use of sound could be relevant to the analysis.
  • A building might be analysed in terms of its architectural features and how it is navigated by visitors.
  • You could analyse the rules of a game and what kind of behaviour they are designed to encourage in players.

While textual analysis is most commonly applied to written language, bear in mind how broad the term ‘text’ is and how varied the methods involved can be.

In the fields of cultural studies and media studies, textual analysis is a key component of research. Researchers in these fields take media and cultural objects – for example, music videos, social media content, billboard advertising – and treat them as texts to be analysed.

Usually working within a particular theoretical framework (e.g., postcolonial theory, media theory, semiotics), researchers seek to connect elements of their texts with issues in contemporary politics and culture. They might analyse many different aspects of the text:

  • Word choice
  • Design elements
  • Location of the text
  • Target audience
  • Relationship with other texts

Textual analysis in this context is usually creative and qualitative in its approach. Researchers seek to illuminate something about the underlying politics or social context of the cultural object they’re investigating.

In the social sciences, textual analysis is often applied to texts such as interview transcripts and surveys, as well as to various types of media. Social scientists use textual data to draw empirical conclusions about social relations.

Textual analysis in the social sciences sometimes takes a more quantitative approach, where the features of texts are measured numerically. For example, a researcher might investigate how often certain words are repeated in social media posts, or which colours appear most prominently in advertisements for products targeted at different demographics.
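As a small illustration of the quantitative approach, the sketch below counts how often each word appears across a handful of posts. It is a toy example under stated assumptions: the posts are invented, `word_counts` is a hypothetical helper, and real studies would normally add steps such as removing very common words and handling word variants.

```python
import re
from collections import Counter


def word_counts(posts: list[str]) -> Counter:
    """Count how often each lowercase word appears across all posts."""
    counts: Counter = Counter()
    for post in posts:
        # Treat runs of letters and apostrophes as words; ignore punctuation.
        counts.update(re.findall(r"[a-z']+", post.lower()))
    return counts


# Invented posts, used only to demonstrate the counting step.
posts = [
    "Love this new phone!",
    "New phone, new problems.",
    "I love a good deal.",
]

counts = word_counts(posts)
print(counts["new"])    # 3
print(counts["phone"])  # 2
print(counts.most_common(1))
```

The resulting counts are only raw material: deciding which words matter, and what their repetition means, is still an interpretive step.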

Some common methods of analysing texts in the social sciences include content analysis, thematic analysis, and discourse analysis.

Textual analysis is the most important method in literary studies. Almost all work in this field involves in-depth analysis of texts – in this context, usually novels, poems, stories, or plays.

Because it deals with literary writing, this type of textual analysis places greater emphasis on the deliberately constructed elements of a text: for example, rhyme and metre in a poem, or narrative perspective in a novel. Researchers aim to understand and explain how these elements contribute to the text’s meaning.

However, literary analysis doesn’t just involve discovering the author’s intended meaning. It often also explores potentially unintended connections between different texts, asks what a text reveals about the context in which it was written, or seeks to analyse a classic text in a new and unexpected way.

Some well-known examples of literary analysis show the variety of approaches that can be taken:

  • Eve Kosofsky Sedgwick’s book Between Men analyses Victorian literature in light of more contemporary perspectives on gender and sexuality.
  • Roland Barthes’ S/Z provides an in-depth structural analysis of a short story by Balzac.
  • Harold Bloom’s The Anxiety of Influence applies his own ‘influence theory’ to an analysis of various classic poets.

How to Analyze Texts

Last Updated: February 15, 2023 Fact Checked

This article was co-authored by Christopher Taylor, PhD and by wikiHow staff writer, Danielle Blinka, MA, MPA. Christopher Taylor is an Adjunct Assistant Professor of English at Austin Community College in Texas. He received his PhD in English Literature and Medieval Studies from the University of Texas at Austin in 2014. This article has been fact-checked, ensuring the accuracy of any cited facts and confirming the authority of its sources. This article has been viewed 91,875 times.

Throughout your academic studies, you’ll be expected to analyze many texts. Analyzing a text on your own can be very intimidating, but it gets easier once you know how to do it. Before analyzing any text, you’ll need to thoroughly study it. Then, tailor your analysis to fit either fiction or nonfiction. Finally, you can write an analysis passage, if necessary.

Studying the Text

Step 1 Write out essential questions or learning objectives for the text.

  • Include your answers to these questions or objectives in your notes about the text.

Step 2 Read the text.

  • Although it’s best to read the text at least twice, this may be harder with longer texts. If this is the case, you can re-read difficult passages within the book.

Step 3 Annotate...

  • For example, use a yellow highlighter to indicate main ideas, and use an orange highlighter to mark the supporting details.
  • For fiction, use a different colored highlighter for passages related to each main character.

Step 4 Take notes as you read.

  • For a fiction text, write down the names and basic information about characters. Additionally, make note of any symbolism and use of literary devices.
  • For a nonfiction text, write down important facts, figures, methods, and dates.

Step 5 Summarize each section of the text.

  • For example, summarize each chapter of a novel, or each paragraph of a short article.

Step 6 Write out your own response to the text.

  • What am I taking away from the piece?
  • How do I feel about the topic?
  • Did this text entertain me or inform me?
  • What will I do with this information now?
  • How does this text apply to real life?

Step 7 Make a reverse...

  • For a work of fiction, outline the plot of the story, as well as any important details and literary devices.
  • For a nonfiction text, focus on the main points, evidence, and supporting details.

Step 8 Read other analyses of the text.

  • These analyses are easy to find through a quick internet search. Just type in the name of your text followed by the word, "analysis."

Examining Fiction

Step 1 Review the context of the text, such as when it was written.

  • When was the text written?
  • What is the historical background of the work?
  • What is the author’s background?
  • What genre does the author work in?
  • Who are the author's contemporaries?
  • How does this text fit in with the author's larger body of work?
  • Did the writer provide their inspiration for the text?
  • What type of society does the author come from?
  • How does the text’s time period shape its meaning?

Step 2 Identify the theme of the text.

  • A short story might have 1 or 2 themes, while a novel might have several. If the text has several themes, they might be related.
  • For example, the themes of a sci-fi novel might be “technology is dangerous” and “cooperation can overcome tyranny.”

Step 3 Determine the main ideas of the text.

  • Notice the character’s words, actions, and thoughts. Consider what they convey about the character, as well as possible themes.
  • Watch for symbolism, metaphor, and the use of other literary devices.

Step 4 Identify pieces of text that support the main ideas.

  • You can use these quotes to support your own claims about the text if you write an analysis essay.

Step 5 Examine the author’s writing style.

  • For example, Edgar Allan Poe’s style of writing enhanced the effect of his poems and stories in an intentional way. If you were analyzing one of his texts, you’d want to consider his individual style.
  • As another example, Mark Twain uses dialect in his novel Pudd'nhead Wilson to show the differences between slave owners and slaves in the Deep South. Twain uses word choice and syntax to show how language can be used to create a divide in society, as well as to control a subsection of the population.

Step 6 Consider the author's...

  • Common tones include sad, solemn, suspenseful, humorous, or sarcastic.
  • Tone can be indicative of not only what's happening in the piece, but of larger themes. The Wonderful Wizard of Oz changes tone, for example, when Dorothy leaves Kansas for Oz. This is seen in the film through the change in color, but in the novel, this is established through the shift in tone.

Evaluating Nonfiction

Step 1 Determine the author’s purpose.

  • What is the topic and discipline?
  • What does the text accomplish?
  • What does the author make you think, believe, or feel?
  • Are the ideas in the text new or borrowed from someone else?

Step 2 Examine the writer’s use of language, including jargon.

  • Using jargon and technical language shows the author is writing for people in their field. They might be trying to instruct or may be presenting research ideas. If you're unsure of a writer's intended audience, technical terms and jargon can be a good indicator.
  • The tone is the mood of a text. For example, a researcher might use a formal, professional tone to present their research findings, while a writer might use an informal, casual tone when writing a magazine article.

Step 3 Identify the author’s argument.

  • If you’re struggling to find the author’s argument, review the evidence they provide in the text. What ideas does the evidence support? This can help you find the argument.
  • For example, the thesis could read as follows: "Based on data and case studies, voters are more likely to choose a candidate they know, supporting the ideas of rational choice theory." The argument here is in favor of rational choice theory.

Step 4 Examine the evidence the author uses to support the argument.

  • For example, evidence that includes research and statistical data may provide a lot of support for an argument, but anecdotal evidence might result in a weak argument.
  • You may want to write out the evidence in your own words, but this may not be necessary.

Step 5 Separate facts from opinions in a nonfiction text.

  • For example, you might highlight facts and opinions using different colors. Alternatively, you might create a chart with facts on one side and opinions on the other.
  • For instance, the writer might state, "According to the survey, 79% of people skim a ballot to find the names they know. Clearly, ballots aren't designed to engage voter interest." The first sentence is a fact, while the second sentence is an opinion.

Step 6 Determine if the text accomplishes its purpose.

  • For example, you might find that the paper on rational choice theory contains few statistics but many pieces of anecdotal evidence. This might lead you to doubt the writer's argument, which means the writer likely didn't achieve their purpose.

Writing an Analysis Paragraph

Step 1 Create a topic sentence explaining your views on the text.

  • Here’s an example: “In the short story ‘Quicksand,’ the author uses quicksand as a metaphor for living with chronic illness.”
  • This is another example: "In the novel Frankenstein, Shelley displays the conventions of the Romantic Period by suggesting that nature has restorative powers."

Step 2 Introduce your supporting text by explaining its context.

  • You could write, “At the beginning of the story, the main character wakes up, dreading the coming day. She knows she needs to get out of bed, but her illness prevents her from rising.”

Step 3 Provide your supporting text, using a lead-in.

  • For example, “To show the struggle, the author writes, ‘I sank back into the bed, feeling as though the mattress was sucking me further and further down.’”
  • As another example, "In Frankenstein, Victor escapes from his problems by frequently going out into nature. After spending two days in nature, Victor says, 'By degrees, the calm and heavenly scene restored me...' (Shelley 47)."

Step 4 Explain how the supporting text backs up your ideas.

  • You might write, “In this passage, the author builds on the metaphor of an illness acting like quicksand by showing the main character struggling to get out of bed. Despite fighting to get up, the main character feels as though they’re sinking further into the bed. Furthermore, the author uses first-person point-of-view to help the reader understand the main character’s thoughts and feelings on their illness.”

Expert Q&A

  • Study guides like CliffsNotes can help you analyze a longer text, which is harder to re-read.
  • Working with a partner or group can help you better understand a text because you can see it from different perspectives. However, make sure any written analysis you do is your own work, not the group’s.

  • Always use quotation marks and a lead-in when directly quoting a passage. Otherwise, you’ll be plagiarizing.

About This Article

Christopher Taylor, PhD

To analyze a text, read over it slowly and carefully, making sure to highlight important information, like main ideas and supporting details. If your teacher gives you any study questions, start by reading the text with these in mind. Otherwise, try to come up with a few questions of your own, like what's the main point and how does the author achieve this? As you read, highlight key passages or make notes in the margins about the main ideas or key passages you come across that help you answer these questions. Another way to analyze a text is to write a short summary of every paragraph or chapter to make sure you fully understand what the author's main points are. For more tips from our Professor co-author, including how to write an analysis paragraph, keep reading!

The Power of Analysis: Tips and Tricks for Writing Analysis Essays

Crystal Sosa

Helpful Links

  • Super Search Webpage: Where to start your research.
  • Scribbr: Textual analysis guide.
  • Analyzing Texts Prezi Presentation: A Prezi presentation on analyzing texts.
  • Writing Essays Guide: A guide to writing essays.
Text analysis and analytical writing are important skills to develop, as they allow individuals to critically engage with written material, understand underlying themes and arguments, and communicate their own ideas clearly and effectively. These skills are essential in academic and professional settings, as well as in everyday life, because they enable individuals to evaluate information and make informed decisions.

What is Text Analysis?

Text analysis is the process of examining and interpreting a written or spoken text to understand its meaning, structure, and context. It involves breaking down the text into its constituent parts, such as words, phrases, and sentences, and analyzing how they work together to convey a particular message or idea.

Text analysis can be used to explore a wide range of textual material, including literature, poetry, speeches, and news articles, and it is often employed in academic research, literary criticism, and media analysis. By analyzing texts, we can gain deeper insights into their meanings, uncover hidden messages and themes, and better understand the social and cultural contexts in which they were produced.

What is an Analysis Essay?

An analysis essay is a type of essay that requires the writer to analyze and interpret a particular text or topic. The goal of an analysis essay is to break down the text or topic into smaller parts and examine each part carefully. This allows the writer to make connections between different parts of the text or topic and develop a more comprehensive understanding of it.

In “The Yellow Wallpaper,” Charlotte Perkins Gilman uses the first-person point of view and vivid descriptions of the protagonist’s surroundings to convey the protagonist’s psychological deterioration. By limiting the reader’s understanding of the story’s events to the protagonist’s perspective, Gilman creates a sense of claustrophobia and paranoia, mirroring the protagonist’s own feelings. Additionally, sensory descriptions of the yellow wallpaper, with its “sprawling flamboyant patterns,” further emphasize the protagonist’s sensory and emotional experience. Through these techniques, Gilman effectively communicates the protagonist’s descent into madness and the effects of societal oppression on women’s mental health.

There are several different types of analysis essays, including:

Literary Analysis Essays: These essays examine a work of literature and analyze various literary devices such as character development, plot, theme, and symbolism.

Rhetorical Analysis Essays: These essays examine how authors use language and rhetoric to persuade their audience, focusing on the author's tone, word choice, and use of rhetorical devices.

Film Analysis Essays: These essays analyze a film's themes, characters, and visual elements, such as cinematography and sound.

Visual Analysis Essays: These essays analyze visual art, such as paintings or sculptures, and explore how the artwork's elements work together to create meaning.

Historical Analysis Essays: These essays analyze historical events or documents and examine their causes, effects, and implications.

Comparative Analysis Essays: These essays compare and contrast two or more works, focusing on similarities and differences between them.

Process Analysis Essays: These essays explain how to do something or how something works, providing a step-by-step analysis of a process.

Analyzing Texts

General Tips

When writing an essay, it's essential to analyze your topic thoroughly. Here are some suggestions for analyzing your topic:

Read carefully: Start by reading your text or prompt carefully. Make sure you understand the key points and what the text or prompt is asking you to do.

Analyze thoroughly: Break the text or topic down into smaller parts and examine each part carefully. This will help you make connections between different parts of the text or topic and develop a more comprehensive understanding of it.

Identify key concepts: Identify the key concepts, themes, and ideas in the text or prompt. This will help you focus your analysis.

Take notes: Take notes on important details and concepts as you read. This will help you remember what you've read and organize your thoughts.

Consider different perspectives: Consider different perspectives and interpretations of the text or prompt. This can help you create a more well-rounded analysis.

Use evidence: Use evidence from the text or outside sources to support your analysis. This can help you make your argument stronger and more convincing.

Formulate your thesis statement: Based on your analysis of the text, formulate your thesis statement. This should be a clear and concise statement that summarizes your main argument.

Use clear and concise language: Use clear and concise language to communicate your ideas effectively. Avoid using overly complicated language that may confuse your reader.

Revise and edit: Revise and edit your essay carefully to ensure that it is clear, concise, and free of errors.

How to Analyze

Understand the assignment: Make sure you fully understand the assignment and the purpose of the analysis. This will help you focus your analysis and ensure that you are meeting the requirements of the assignment.

Read the essay multiple times: Reading the essay multiple times will help you to identify the author's main argument, key points, and supporting evidence.

Take notes: As you read the essay, take notes on key points, quotes, and examples. This will help you to organize your thoughts and identify patterns in the author's argument.

Take breaks: It's important to take breaks while reading academic essays to avoid burnout. Take a break every 20-30 minutes and do something completely different, like going for a walk or listening to music. This can help you to stay refreshed and engaged.

Highlight or underline key points: As you read, highlight or underline key points, arguments, and evidence that stand out to you. This will help you to remember and analyze important information later.

Ask questions: Ask yourself questions as you read to help you engage critically with the text. What is the author's argument? What evidence do they use to support their claims? What are the strengths and weaknesses of their argument?

Engage in active reading: Instead of passively reading, engage in active reading by asking questions, making connections to other readings or personal experiences, and reflecting on what you've read.

Find a discussion partner: Find someone to discuss the essay with, whether it's a classmate, a friend, or a teacher. Discussing the essay can help you to process and analyze the information more deeply, and can also help you to stay engaged.

What to Analyze

Identify the author's purpose and audience: Consider why the author wrote the essay and who their intended audience is. This will help you to better understand the author's perspective and the purpose of their argument.

Analyze the structure of the essay: Consider how the essay is structured and how this supports the author's argument. Look for patterns in the organization of ideas and the use of transitions.

Evaluate the author's use of evidence: Evaluate the author's use of evidence and how it supports their argument. Consider whether the evidence is credible, relevant, and sufficient to support the author's claims.

Consider the author's tone and style: Consider the author's tone and style and how it contributes to their argument. Look for patterns in the use of language, imagery, and rhetorical devices.

Consider the context: Consider the context in which the essay was written, such as the author's background, the time period, and any societal or cultural factors that may have influenced their perspective.

Evaluate the evidence: Evaluate the evidence presented in the essay and consider whether it is sufficient to support the author's argument. Look for any biases or assumptions that may be present in the evidence.

Consider alternative viewpoints: Consider alternative viewpoints and arguments that may challenge the author's perspective. This can help you to engage critically with the text and develop a more well-rounded understanding of the topic.

  • Last Updated: Jun 21, 2023 1:01 PM

Creative Commons License

Beginner's Guide to Literary Analysis

Understanding Literature & How to Write Literary Analysis

Literary analysis is the foundation of every college and high school English class. Once you can comprehend written work and respond to it, the next step is to learn how to think critically and complexly about a work of literature in order to analyze its elements and establish ideas about its meaning.

If that sounds daunting, it shouldn’t. Literary analysis is really just a way of thinking creatively about what you read. The practice takes you beyond the storyline and into the motives behind it. 

While an author might have had a specific intention when they wrote their book, there’s still no right or wrong way to analyze a literary text—just your way. You can use literary theories, which act as “lenses” through which you can view a text. Or you can use your own creativity and critical thinking to identify a literary device or pattern in a text and weave that insight into your own argument about the text’s underlying meaning. 

Now, if that sounds fun, it should, because it is. Here, we’ll lay the groundwork for performing literary analysis, including when writing analytical essays, to help you read books like a critic.

What Is Literary Analysis?

As the name suggests, literary analysis is an analysis of a work, whether that’s a novel, play, short story, or poem. Any analysis requires breaking the content into its component parts and then examining how those parts operate independently and as a whole. In literary analysis, those parts can be different devices and elements—such as plot, setting, themes, symbols, etcetera—as well as elements of style, like point of view or tone. 

When performing analysis, you consider some of these different elements of the text and then form an argument for why the author chose to use them. You can do so while reading and during class discussion, but it’s particularly important when writing essays. 

Literary analysis is notably distinct from summary. When you write a summary, you efficiently describe the work’s main ideas or plot points in order to establish an overview of the work. While you might use elements of summary when writing analysis, you should do so minimally. You can reference a plot line to make a point, but do so quickly so you can focus on why that plot line matters. In summary (see what we did there?), a summary focuses on the “what” of a text, while analysis turns attention to the “how” and “why.”

While literary analysis can be broad, covering themes across an entire work, it can also be very specific, and sometimes the best analysis is just that. Literary critics have written thousands of words about the meaning of an author’s single word choice; while you might not want to be quite that particular, there’s a lot to be said for digging deep in literary analysis, rather than wide. 

Although you’re forming your own argument about the work, it’s not your opinion. You should avoid passing judgment on the piece and instead objectively consider what the author intended, how they went about executing it, and whether or not they were successful in doing so. Literary criticism is similar to literary analysis, but it differs in that it does pass judgment on the work. Criticism can also consider literature more broadly, without focusing on a single work.

Once you understand what constitutes (and doesn’t constitute) literary analysis, it’s easy to identify it. Here are some examples of literary analysis and its oft-confused counterparts: 

Summary: In “The Fall of the House of Usher,” the narrator visits his friend Roderick Usher and witnesses his sister escape a horrible fate.  

Opinion: In “The Fall of the House of Usher,” Poe uses his great Gothic writing to establish a sense of spookiness that is enjoyable to read. 

Literary Analysis: Throughout “The Fall of the House of Usher,” Poe foreshadows the fate of Madeline by creating a sense of claustrophobia for the reader through symbols, such as the narrator’s inability to leave and the labyrinthine nature of the house.

In summary, literary analysis is:

  • Breaking a work into its components
  • Identifying what those components are and how they work in the text
  • Developing an understanding of how they work together to achieve a goal 
  • Not an opinion, but subjective 
  • Not a summary, though summary can be used in passing 
  • Best when it deeply, rather than broadly, analyzes a literary element

Literary Analysis and Other Works

As discussed above, literary analysis is often performed upon a single work—but it doesn’t have to be. It can also be performed across works to consider the interplay of two or more texts. Regardless of whether or not the works were written about the same thing, or even within the same time period, they can have an influence on one another or a connection that’s worth exploring. And reading two or more texts side by side can help you to develop insights through comparison and contrast.

For example, Paradise Lost is an epic poem written in the 17th century, based largely on biblical narratives written more than two thousand years before, and it later influenced 19th-century poet John Keats. The interplay of works can be obvious, as here, or entirely the inspiration of the analyst. As an example of the latter, you could compare and contrast the writing styles of Ralph Waldo Emerson and Edgar Allan Poe, who, while contemporaries, were vastly different in their content.

Additionally, literary analysis can be performed between a work and its context. Authors are often speaking to the larger context of their times, be that social, political, religious, economic, or artistic. A valid and interesting form is to compare the author’s context to the work, which is done by identifying and analyzing elements that are used to make an argument about the writer’s time or experience. 

For example, you could write an essay about how Hemingway’s struggles with mental health and paranoia influenced his later work, or how his involvement in the Spanish Civil War influenced his 1940 novel For Whom the Bell Tolls. One approach focuses more on his personal experience, while the other turns to the context of his times; both are valid.

Why Does Literary Analysis Matter? 

Sometimes an author wrote a work of literature strictly for entertainment’s sake, but more often than not, they meant something more. Whether that was a missive on world peace, commentary about femininity, or an allusion to their experience as an only child, the author probably wrote their work for a reason, and understanding that reason—or the many reasons—can actually make reading a lot more meaningful. 

Performing literary analysis as a form of study unquestionably makes you a better reader. It’s also likely that it will improve other skills, too, like critical thinking, creativity, debate, and reasoning. 

At its grandest and most idealistic, literary analysis even has the ability to make the world a better place. By reading and analyzing works of literature, you are able to more fully comprehend the perspectives of others. Cumulatively, you’ll broaden your own perspectives and contribute more effectively to the things that matter to you. 

Literary Terms to Know for Literary Analysis 

There are hundreds of literary devices you could consider during your literary analysis, but there are some key tools most writers use to achieve their purpose, and you need to know them in order to understand that purpose. These common devices include:

  • Characters: The people (or entities) who play roles in the work. The protagonist is the main character in the work.
  • Conflict: The driving force behind the plot; the event that causes action in the narrative, usually on the part of the protagonist.
  • Context: The broader circumstances surrounding the work, such as the political and social climate in which it was written or the experience of the author. It can also refer to internal context, meaning the details presented by the narrator.
  • Diction: The word choice used by the narrator or characters.
  • Genre: A category of literature characterized by agreed-upon similarities in the works, such as subject matter and tone.
  • Imagery: The descriptive or figurative language used to paint a picture in the reader’s mind so they can picture the story’s plot, characters, and setting.
  • Metaphor: A figure of speech that uses comparison between two unlike objects for dramatic or poetic effect.
  • Narrator: The person who tells the story. Sometimes they are a character within the story, but sometimes they are omniscient and removed from the plot.
  • Plot: The storyline of the work.
  • Point of view: The perspective taken by the narrator, which shapes the perspective of the reader.
  • Setting: The time and place in which the story takes place. This can include elements like the time period, weather, time of year or day, and social or economic conditions.
  • Symbol: An object, person, or place that represents an abstract idea that is greater than its literal meaning.
  • Syntax: The structure of a sentence, either narration or dialogue, and the tone it implies.
  • Theme: A recurring subject or message within the work, often commentary on larger societal or cultural ideas.
  • Tone: The feeling, attitude, or mood the text presents.

How to Perform Literary Analysis

Step 1: Read the Text Thoroughly

Literary analysis begins with the literature itself, which means performing a close reading of the text. As you read, you should focus on the work. That means putting away distractions (sorry, smartphone) and dedicating a period of time to the task at hand. 

It’s also important that you don’t skim or speed read. While those are helpful skills, they don’t apply to literary analysis, or at least not at this stage.

Step 2: Take Notes as You Read  

As you read the work, take notes about different literary elements and devices that stand out to you. Whether you highlight or underline in text, use sticky note tabs to mark pages and passages, or handwrite your thoughts in a notebook, you should capture your thoughts and the parts of the text to which they correspond. This—the act of noticing things about a literary work—is literary analysis. 

Step 3: Notice Patterns 

As you read the work, you’ll begin to notice patterns in the way the author deploys language, themes, and symbols to build their plot and characters. As you read and these patterns take shape, begin to consider what they could mean and how they might fit together. 

As you identify these patterns, as well as other elements that catch your interest, be sure to record them in your notes or text. Some examples include: 

  • Circle or underline words or terms that you notice the author uses frequently, whether those are nouns (like “eyes” or “road”) or adjectives (like “yellow” or “lush”).
  • Highlight phrases that give you the same kind of feeling. For example, if the narrator describes an “overcast sky,” a “dreary morning,” and a “dark, quiet room,” the words aren’t the same, but the feeling they impart and setting they develop are similar. 
  • Underline quotes or prose that define a character’s personality or their role in the text.
  • Use sticky tabs to color code different elements of the text, such as specific settings or a shift in the point of view. 

As you note these patterns, overarching symbols, metaphors, and ideas will begin to come into focus.
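If you have a digital copy of the text, the first tip above (spotting words the author uses frequently) can be roughed out with a short script. This is only a minimal sketch: the function name `frequent_words`, the small stop-word list, and the sample passage are all made up for illustration, and a word count is a starting point for noticing patterns, not a substitute for close reading.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; extend it for real use.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "was", "were", "it", "on", "at"}


def frequent_words(text, top_n=5):
    """Return the top_n most common non-stop words as (word, count) pairs."""
    # Lowercase the text and pull out runs of letters (and apostrophes).
    words = re.findall(r"[a-z']+", text.lower())
    # Count every word that is not in the stop-word list.
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


# Invented sample passage, not taken from any real work.
sample = (
    "The road was yellow at dawn. Her eyes followed the road, "
    "and the yellow dust settled on the road again."
)
print(frequent_words(sample, top_n=2))  # [('road', 3), ('yellow', 2)]
```

Words that rise to the top of the list are candidates to circle or underline in your copy; whether they actually form a meaningful pattern is still a judgment you make as a reader.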

Step 4: Consider the Work as a Whole, and Ask Questions

This is a step that you can do either as you read or after you finish the text. The point is to begin to identify the aspects of the work that most interest you and that you could therefore analyze in writing or discussion.

Questions you could ask yourself include: 

  • What aspects of the text do I not understand?
  • What parts of the narrative or writing struck me most?
  • What patterns did I notice?
  • What did the author accomplish really well?
  • What did I find lacking?
  • Did I notice any contradictions or anything that felt out of place?  
  • What was the purpose of the minor characters?
  • What tone did the author choose, and why? 

The answers to these and more questions will lead you to your arguments about the text. 

Step 5: Return to Your Notes and the Text for Evidence

As you identify the argument you want to make (especially if you’re preparing for an essay), return to your notes to see if you already have supporting evidence for your argument. That’s why it’s so important to take notes or mark passages as you read—you’ll thank yourself later!

If you’re preparing to write an essay, you’ll use these passages and ideas to bolster your argument—aka, your thesis. There will likely be multiple different passages you can use to strengthen multiple different aspects of your argument. Just be sure to cite the text correctly! 

If you’re preparing for class, your notes will also be invaluable. When your teacher or professor leads the conversation in the direction of your ideas or arguments, you’ll be able to not only proffer that idea but back it up with textual evidence. That’s an A+ in class participation. 

Step 6: Connect These Ideas Across the Narrative

Whether you’re in class or writing an essay, literary analysis isn’t complete until you’ve considered the way these ideas interact and contribute to the work as a whole. You can find and present evidence, but you still have to explain how those elements work together and make up your argument. 

How to Write a Literary Analysis Essay

When conducting literary analysis while reading a text or discussing it in class, you can pivot easily from one argument to another (or even switch sides if a classmate or teacher makes a compelling enough argument). 

But when writing literary analysis, your objective is to propose a specific, arguable thesis and convincingly defend it. In order to do so, you need to fortify your argument with evidence from the text (and perhaps secondary sources) and an authoritative tone. 

A successful literary analysis essay depends equally on a thoughtful thesis, supportive analysis, and presenting these elements masterfully. We’ll review how to accomplish these objectives below. 

Step 1: Read the Text. Maybe Read It Again. 

Constructing an astute analytical essay requires a thorough knowledge of the text. As you read, be sure to note any passages, quotes, or ideas that stand out. These could serve as the future foundation of your thesis statement. Noting these sections now will help you when you need to gather evidence. 

The more familiar you become with the text, the better (and easier!) your essay will be. Familiarity with the text allows you to speak (or in this case, write) to it confidently. If you only skim the book, your lack of rich understanding will be evident in your essay. Alternatively, if you read the text closely—especially if you read it more than once, or at least carefully revisit important passages—your own writing will be filled with insight that goes beyond a basic understanding of the storyline. 

Step 2: Brainstorm Potential Topics 

Because you took detailed notes while reading the text, you should have a list of potential topics at the ready. Take time to review your notes, highlighting any ideas or questions you had that feel interesting. You should also return to the text and look for any passages that stand out to you. 

When considering potential topics, you should prioritize ideas that you find interesting. Not only will this make the whole process of writing an essay more fun, but your enthusiasm for the topic will probably improve the quality of your argument, and maybe even your writing. Just as it’s obvious when a topic interests you in a conversation, it’s obvious when a topic interests the writer of an essay (and even more obvious when it doesn’t).

Your topic ideas should also be specific, unique, and arguable. A good way to think of topics is that they’re the answer to fairly specific questions. As you begin to brainstorm, first think of questions you have about the text. Questions might focus on the plot, such as: Why did the author choose to deviate from the projected storyline? Or why did a character’s role in the narrative shift? Questions might also consider the use of a literary device, such as: Why does the narrator frequently repeat a phrase or comment on a symbol? Or why did the author choose to switch points of view each chapter? 

Once you have a thesis question, you can begin brainstorming answers (that is, potential thesis statements). At this point, your answers can be fairly broad. Once you land on a question-statement combination that feels right, you’ll then look for evidence in the text that supports your answer (and helps you define and narrow your thesis statement). 

For example, after reading “The Fall of the House of Usher,” you might be wondering, “Why are Roderick and Madeline twins?” or even “Why does their relationship feel so creepy?” Maybe you noticed (and noted) that the narrator was surprised to find out they were twins, or perhaps you found that the narrator’s tone tended to shift and become more anxious when discussing the interactions of the twins.

Once you come up with your thesis question, you can identify a broad answer, which will become the basis for your thesis statement. In response to the questions above, your answer might be, “Poe emphasizes the close relationship of Roderick and Madeline to foreshadow that their deaths will be close, too.” 

Step 3: Gather Evidence 

Once you have your topic (or you’ve narrowed it down to two or three), return to the text (yes, again) to see what evidence you can find to support it. If you’re thinking of writing about the relationship between Roderick and Madeline in “The Fall of the House of Usher,” look for instances where they interact in the text. 

This is when your knowledge of literary devices comes in clutch. Carefully study the language around each event in the text that might be relevant to your topic. How does Poe’s diction or syntax change during the interactions of the siblings? How does the setting reflect or contribute to their relationship? What imagery or symbols appear when Roderick and Madeline are together? 

By finding and studying evidence within the text, you’ll strengthen your topic argument—or, just as valuably, discount the topics that aren’t strong enough for analysis. 

Step 4: Consider Secondary Sources 

In addition to returning to the literary work you’re studying for evidence, you can also consider secondary sources that reference or speak to the work. These can be articles from journals you find on JSTOR, books that consider the work or its context, or articles your teacher shared in class. 

While you can use these secondary sources to further support your idea, you should not overuse them. Make sure your argument remains clearly distinct from the one presented in the source. 

Step 5: Write a Working Thesis Statement

Once you’ve gathered evidence and narrowed down your topic, you’re ready to refine that topic into a thesis statement. As you continue to outline and write your paper, this thesis statement will likely change slightly, but this initial draft will serve as the foundation of your essay. It’s like your north star: Everything you write in your essay is leading you back to your thesis. 

Writing a great thesis statement requires some real finesse. A successful thesis statement is: 

  • Debatable: You shouldn’t simply summarize or make an obvious statement about the work. Instead, your thesis statement should take a stand on an issue or make a claim that is open to argument. You’ll spend your essay debating, and proving, your argument. 
  • Demonstrable: You need to be able to prove, through evidence, that your thesis statement is true. That means you have to have passages from the text and correlative analysis ready to convince the reader that you’re right. 
  • Specific: In most cases, successfully addressing a theme that encompasses a work in its entirety would require a book-length essay. Instead, identify a thesis statement that addresses specific elements of the work, such as a relationship between characters, a repeating symbol, a key setting, or even something really specific like the speaking style of a character. 

Example: By depicting the relationship between Roderick and Madeline to be stifling and almost otherworldly in its closeness, Poe foreshadows both Madeline’s fate and Roderick’s inability to choose a different fate for himself. 

Step 6: Write an Outline 

You have your thesis, you have your evidence—but how do you put them together? A great thesis statement (and therefore a great essay) will have multiple arguments supporting it, presenting different kinds of evidence that all contribute to the singular, main idea presented in your thesis. 

Review your evidence and identify these different arguments, then organize the evidence into categories based on the argument they support. These ideas and evidence will become the body paragraphs of your essay. 

For example, if you were writing about Roderick and Madeline as in the example above, you would pull evidence from the text, such as the narrator’s realization of their relationship as twins; examples where the narrator’s tone of voice shifts when discussing their relationship; imagery, like the sounds Roderick hears as Madeline tries to escape; and Poe’s tendency to use doubles and twins in his other writings to create the same spooky effect. All of these are separate strains of the same argument, and can be clearly organized into sections of an outline. 

Step 7: Write Your Introduction

Your introduction serves a few very important purposes that essentially set the scene for the reader: 

  • Establish context. Sure, your reader has probably read the work. But you still want to remind them of the scene, characters, or elements you’ll be discussing. 
  • Present your thesis statement. Your thesis statement is the backbone of your analytical paper. You need to present it clearly at the outset so that the reader understands what every argument you make is aimed at. 
  • Offer a mini-outline. While you don’t want to show all your cards just yet, you do want to preview some of the evidence you’ll be using to support your thesis so that the reader has a roadmap of where they’re going. 

Step 8: Write Your Body Paragraphs

Thanks to steps one through seven, you’ve already set yourself up for success. You have clearly outlined arguments and evidence to support them. Now it’s time to translate those into authoritative and confident prose. 

When presenting each idea, begin with a topic sentence that encapsulates the argument you’re about to make (sort of like a mini-thesis statement). Then present your evidence, along with explanations of how that evidence contributes to the argument. Present enough material to prove your point, but don’t feel like you necessarily have to point out every single instance in the text where this element takes place. For example, if you’re highlighting a symbol that repeats throughout the narrative, choose two or three passages where it is used most effectively, rather than trying to squeeze in all ten times it appears. 

While you should have clearly defined arguments, the essay should still move logically and fluidly from one argument to the next. Try to avoid choppy paragraphs that feel disjointed; every idea and argument should feel connected to the last, and, as a group, connected to your thesis. A great way to connect the ideas from one paragraph to the next is with transition words and phrases, such as: 

  • Furthermore 
  • In addition
  • On the other hand
  • Conversely 

Step 9: Write Your Conclusion 

Your conclusion is more than a summary of your essay's parts, but it’s also not a place to present brand new ideas not already discussed in your essay. Instead, your conclusion should return to your thesis (without repeating it verbatim) and point to why this all matters. If writing about the siblings in “The Fall of the House of Usher,” for example, you could point out that the utilization of twins and doubles is a common literary element of Poe’s work that contributes to the definitive eeriness of Gothic literature. 

While you might speak to larger ideas in your conclusion, be wary of getting too macro. Your conclusion should still be supported by all of the ideas that preceded it. 

Step 10: Revise, Revise, Revise

Of course you should proofread your literary analysis essay before you turn it in. But you should also edit the content to make sure every piece of evidence and every explanation directly supports your thesis as effectively and efficiently as possible. 

Sometimes, this might mean actually adapting your thesis a bit to the rest of your essay. At other times, it means removing redundant examples or paraphrasing quotations. Make sure every sentence is valuable, and remove those that aren’t. 

Other Resources for Literary Analysis 

With these skills and suggestions, you’re well on your way to practicing and writing literary analysis. But if you don’t have a firm grasp on the concepts discussed above—such as literary devices or even the content of the text you’re analyzing—it will still feel difficult to produce insightful analysis. 

If you’d like to sharpen the tools in your literature toolbox, there are plenty of other resources to help you do so: 

  • Check out our expansive library of Literary Devices. These could provide you with a deeper understanding of the basic devices discussed above or introduce you to new concepts sure to impress your professors (anagnorisis, anyone?). 
  • This Academic Citation Resource Guide ensures you properly cite any work you reference in your analytical essay. 
  • Our English Homework Help Guide will point you to dozens of resources that can help you perform analysis, from critical reading strategies to poetry helpers. 
  • This Grammar Education Resource Guide will direct you to plenty of resources to refine your grammar and writing (definitely important for getting an A+ on that paper). 

Of course, you should know the text inside and out before you begin writing your analysis. In order to develop a true understanding of the work, read through its corresponding SuperSummary study guide. Doing so will help you truly comprehend the plot, as well as provide some inspirational ideas for your analysis.

Text Analysis

Below are sample text analysis assignments:

  • Text Analysis Papers Description "Handout"
  • Sample Essay Assignment "Two Options"
  • Sample Assignment "Novel Response" (Kennedy)
  • Sample Essay Assignment "Literary Analysis Paper: Critical Comparison of Short Fiction" (Kennedy)
  • Sample Essay Assignment "Literary Analysis Paper: Constructing a Canon" (Kennedy)

Textual Analysis: Definition, Types & 10 Examples

Textual analysis is a research methodology that involves exploring written text as empirical data. Scholars explore both the content and structure of texts, and attempt to discern key themes and statistics emergent from them.

This method of research is used in various academic disciplines, including cultural studies, literature, biblical studies, anthropology, sociology, and others (Dearing, 2022; McKee, 2003).

This method of analysis involves breaking down a text into its constituent parts for close reading and making inferences about its context, underlying themes, and the intentions of its author.

Textual Analysis Definition

Alan McKee is one of the preeminent scholars of textual analysis. He provides a clear and approachable definition in his book Textual Analysis: A Beginner’s Guide (2003) where he writes:

“When we perform textual analysis on a text we make an educated guess at some of the most likely interpretations that might be made of the text […] in order to try and obtain a sense of the ways in which, in particular cultures at particular times, people make sense of the world around them.”

A key insight worth extracting from this definition is that textual analysis can reveal what cultural groups value, how they create meaning, and how they interpret reality.

This is invaluable in situations where scholars are seeking to more deeply understand cultural groups and civilizations – both past and present (Metoyer et al., 2018).

As such, it may be beneficial for a range of different types of studies, such as:

  • Studies of Historical Texts: A study of how certain concepts are framed, described, and approached in historical texts, such as the Bible.
  • Studies of Industry Reports: A study of how industry reports frame and discuss concepts such as environmental and social responsibility.
  • Studies of Literature: A study of how a particular text or group of texts within a genre define and frame concepts. For example, you could explore how great American literature mythologizes the concept of ‘The American Dream’.
  • Studies of Speeches: A study of how certain politicians position national identities in their appeals for votes.
  • Studies of Newspapers: A study of the biases within newspapers toward or against certain groups of people.
  • Etc. (For more, see: Dearing, 2022)

McKee uses the term ‘textual analysis’ to also refer to text types that are not just written, but multimodal. For a dive into the analysis of multimodal texts, I recommend my article on content analysis, where I explore the study of texts like television advertisements and movies in detail.

Features of a Textual Analysis

When conducting a textual analysis, you’ll need to consider a range of factors within the text that are worthy of close examination to infer meaning. Features worthy of considering include:

  • Content: What is being said or conveyed in the text, including explicit and implicit meanings, themes, or ideas.
  • Context: When and where the text was created, the culture and society it reflects, and the circumstances surrounding its creation and distribution.
  • Audience: Who the text is intended for, how it’s received, and the effect it has on its audience.
  • Authorship: Who created the text, their background and perspectives, and how these might influence the text.
  • Form and structure: The layout, sequence, and organization of the text and how these elements contribute to its meanings (Metoyer et al., 2018).

Textual Analysis Coding Methods

The above features may be examined through quantitative or qualitative research designs, or a mixed-methods approach.

1. Quantitative Approaches

You could analyze several of the above features, namely, content, form, and structure, from a quantitative perspective using computational linguistics and natural language processing (NLP) analysis.

With this approach, you would use algorithms to extract useful information or insights, such as the frequency of word and phrase usage. This can include techniques like sentiment analysis, topic modeling, named entity recognition, and more.
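
As a minimal sketch of the quantitative approach, the Python snippet below counts how often terms from a small coding dictionary appear in a text. The `CODEBOOK` categories, their terms, and the sample report sentence are all hypothetical and chosen only for illustration; a real study would build its codebook from the research question.

```python
import re

# Hypothetical coding dictionary: category -> terms that signal it.
CODEBOOK = {
    "environment": {"climate", "emissions", "sustainable"},
    "social": {"community", "workers", "diversity"},
}

def count_categories(text, codebook=CODEBOOK):
    """Count how many word tokens in text fall under each codebook category."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {category: 0 for category in codebook}
    for token in tokens:
        for category, terms in codebook.items():
            if token in terms:
                counts[category] += 1
    return counts

# Invented sample sentence standing in for an industry report.
report = (
    "We cut emissions this year and invested in community programs. "
    "Lower emissions support a sustainable future."
)
print(count_categories(report))  # {'environment': 3, 'social': 1}
```

Counts like these say how often a concept is mentioned, not how it is framed, which is where the qualitative reading comes in.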

2. Qualitative Approaches

In many ways, textual analysis lends itself best to qualitative analysis. When identifying words and phrases, you’re also going to want to look at the surrounding context and possibly cultural interpretations of what is going on (Mayring, 2015).

Generally, humans are far more perceptive at teasing out these contextual factors than machines (although, AI is giving us a run for our money).

One qualitative approach to textual analysis that I regularly use is inductive coding, a step-by-step methodology that can help you extract themes from texts. If you’re interested in using this step-by-step method, read my guide on inductive coding.

Textual Analysis Examples

Title: “Discourses on Gender, Patriarchy and Resolution 1325: A Textual Analysis of UN Documents”
Author: Nadine Puechguirbal
Year: 2010
APA Citation: Puechguirbal, N. (2010). Discourses on Gender, Patriarchy and Resolution 1325: A Textual Analysis of UN Documents, International Peacekeeping, 17 (2): 172-187. doi: 10.1080/13533311003625068

Summary: The article discusses the language used in UN documents related to peace operations and analyzes how it perpetuates stereotypical portrayals of women as vulnerable individuals. The author argues that this language removes women’s agency and keeps them in a subordinate position as victims, instead of recognizing them as active participants and agents of change in post-conflict environments. Despite the adoption of UN Security Council Resolution 1325, which aims to address the role of women in peace and security, the author suggests that the UN’s male-dominated power structure remains unchallenged, and gender mainstreaming is often presented as a non-political activity.

Title: “Racism and the Media: A Textual Analysis”
Author: Kassia E. Kulaszewicz
Year: 2015
APA Citation: Kulaszewicz, K. E. (2015). Racism and the Media: A Textual Analysis. Dissertation. Retrieved from:

Summary: This study delves into the significant role media plays in fostering explicit racial bias. Using Bandura’s Learning Theory, it investigates how media content influences our beliefs through ‘observational learning’. Conducting a textual analysis, it finds differences in representation of black and white people, stereotyping of black people, and ostensibly micro-aggressions toward black people. The research highlights how media often criminalizes Black men, portraying them as violent, while justifying or supporting the actions of White officers, regardless of their potential criminality. The study concludes that news media likely continues to reinforce racism, whether consciously or unconsciously.

Title: “On the metaphorical nature of intellectual capital: a textual analysis”
Author: Daniel Andriessen
Year: 2006
APA Citation: Andriessen, D. (2006). On the metaphorical nature of intellectual capital: a textual analysis. Journal of Intellectual Capital, 7 (1), 93-110.

Summary: This article delves into the metaphorical underpinnings of intellectual capital (IC) and knowledge management, examining how knowledge is conceptualized through metaphors. The researchers employed a textual analysis methodology, scrutinizing key texts in the field to identify prevalent metaphors. They found that over 95% of statements about knowledge are metaphor-based, with “knowledge as a resource” and “knowledge as capital” being the most dominant. This study demonstrates how textual analysis helps us understand how a topic is currently conceptualized and spoken about.

Title: “Race in Rhetoric: A Textual Analysis of Barack Obama’s Campaign Discourse Regarding His Race”
Author: Andrea Dawn Andrews
Year: 2011
APA Citation: Andrews, A. D. (2011). Race in Rhetoric: A Textual Analysis of Barack Obama’s Campaign Discourse Regarding His Race. Undergraduate Honors Thesis Collection. 120.

Summary: This undergraduate honors thesis is a textual analysis of Barack Obama’s speeches that explores how Obama frames the concept of race. The student’s capstone project found that Obama tended to frame racial inequality as something that could be overcome, and that this was a positive and uplifting project. Here, the student breaks down times when Obama utilizes the concept of race in his speeches, and examines the surrounding content to see the connotations associated with race and race-relations embedded in the text. Here, we see a decidedly qualitative approach to textual analysis which can deliver contextualized and in-depth insights.

Sub-Types of Textual Analysis

While above I have focused on a generalized textual analysis approach, a range of sub-types and offshoots have emerged that focus on specific concepts, often within their own specific theoretical paradigms. Each is outlined below, and where I’ve got a guide, I’ve linked to it:

  • Content Analysis: Content analysis is similar to textual analysis, and I would consider it a type of textual analysis that takes a broader understanding of the term ‘text’. In this type, a text is any type of ‘content’, and could be multimodal in nature, such as television advertisements, movies, posters, and so forth. Content analysis can be both qualitative and quantitative, depending on whether it focuses more on the meaning of the content or the frequency of certain words or concepts (Chung & Pennebaker, 2018).
  • Discourse Analysis: Emergent specifically from critical and postmodern/poststructural theories, discourse analysis focuses closely on the use of language within a social context, with the goal of revealing how repeated framing of terms and concepts has the effect of shaping how cultures understand social categories. It considers how texts interact with and shape social norms, power dynamics, ideologies, etc. For example, it might examine how gender is socially constructed as a distinct social category through Disney films. It may also be called ‘critical discourse analysis’.
  • Narrative Analysis: This approach is used for analyzing stories and narratives within text. It looks at elements like plot, characters, themes, and the sequence of events to understand how narratives construct meaning.
  • Frame Analysis: This approach looks at how events, ideas, and themes are presented or “framed” within a text. It explores how these frames can shape our understanding of the information being presented. While similar to discourse analysis, a frame analysis tends to be less associated with the loaded concept of ‘discourse’ that exists specifically within postmodern paradigms (Smith, 2017).
  • Semiotic Analysis: This approach studies signs and symbols, both visual and textual, and could be a good complement to a content analysis, as it provides the language and understandings necessary to describe how signs make meaning in cultural contexts that we might find within the fields of semantics and pragmatics. It’s based on the theory of semiotics, which is concerned with how meaning is created and communicated through signs and symbols.
  • Computational Textual Analysis: In the context of data science or artificial intelligence, this type of analysis involves using algorithms to process large amounts of text. Techniques can include topic modeling, sentiment analysis, word frequency analysis, and others. While being extremely useful for a quantitative analysis of a large dataset of text, it falls short in its ability to provide deep contextualized understandings of words-in-context.

Each of these methods has its strengths and weaknesses, and the choice of method depends on the research question, the type of text being analyzed, and the broader context of the research.

Strengths and Weaknesses of Textual Analysis

When writing your methodology for your textual analysis, make sure to define not only what textual analysis is, but (if applicable) the type of textual analysis, the features of the text you’re analyzing, and the ways you will code the data. It’s also worth actively reflecting on the potential weaknesses of a textual analysis approach, but also explaining why, despite those weaknesses, you believe this to be the most appropriate methodology for your study.

Chung, C. K., & Pennebaker, J. W. (2018). Textual analysis. In  Measurement in social psychology  (pp. 153-173). Routledge.

Dearing, V. A. (2022).  Manual of textual analysis . Univ of California Press.

McKee, A. (2003). Textual analysis: A beginner’s guide.  Textual analysis , 1-160.

Mayring, P. (2015). Qualitative content analysis: Theoretical background and procedures.  Approaches to qualitative research in mathematics education: Examples of methodology and methods , 365-380. doi:

Metoyer, R., Zhi, Q., Janczuk, B., & Scheirer, W. (2018, March). Coupling story to visualization: Using textual analysis as a bridge between data and interpretation. In  23rd International Conference on Intelligent User Interfaces  (pp. 503-507). doi:

Smith, J. A. (2017). Textual analysis.  The international encyclopedia of communication research methods , 1-7.


Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.

What is Text Analysis? A Beginner’s Guide

Introduction to Text Analysis

If you receive huge amounts of unstructured data in the form of text (emails, social media conversations, chats), you’re probably aware of the challenges that come with analyzing this data.

Manually processing and organizing text data is slow, tedious, and error-prone, and it can be expensive if you need to hire extra staff to sort through text.

In this guide, learn more about what text analysis is, how to perform text analysis using AI tools, and why it’s more important than ever to automatically analyze your text in real time.

What Is Text Analysis?

Text analysis (TA) is a machine learning technique used to automatically extract valuable insights from unstructured text data. Companies use text analysis tools to quickly digest online data and documents, and transform them into actionable insights.

You can use text analysis to extract specific information, like keywords, names, or company information from thousands of emails, or categorize survey responses by sentiment and topic.

Text Analysis vs. Text Mining vs. Text Analytics

Firstly, let's dispel the myth that text mining and text analysis are two different processes. The terms are often used interchangeably to explain the same process of obtaining data through statistical pattern learning. To avoid any confusion here, let's stick to text analysis.

So, text analytics vs. text analysis: what's the difference?

Text analysis delivers qualitative results and text analytics delivers quantitative results. If a machine performs text analysis, it identifies important information within the text itself, but if it performs text analytics, it reveals patterns across thousands of texts, resulting in graphs, reports, tables, etc.

Let's say a customer support manager wants to know how many support tickets were solved by individual team members. In this instance, they'd use text analytics to create a graph that visualizes individual ticket resolution rates.

However, it's likely that the manager also wants to know what proportion of tickets resulted in a positive or negative outcome.

By analyzing the text within each ticket, and subsequent exchanges, customer support managers can see how each agent handled tickets, and whether customers were happy with the outcome.

Basically, the challenge in text analysis is decoding the ambiguity of human language, while in text analytics it's detecting patterns and trends from the numerical results.

Why Is Text Analysis Important?

When you put machines to work on organizing and analyzing your text data, the insights and benefits are huge.

Let's take a look at some of the advantages of text analysis, below:

Text Analysis Is Scalable

Text analysis tools allow businesses to structure vast quantities of information, like emails, chats, social media, support tickets, documents, and so on, in seconds rather than days, so you can redirect extra resources to more important business tasks.

Analyze Text in Real-time

Businesses are inundated with information and customer comments can appear anywhere on the web these days, but it can be difficult to keep an eye on it all. Text analysis is a game-changer when it comes to detecting urgent matters, wherever they may appear, 24/7 and in real time. By training text analysis models to detect expressions and sentiments that imply negativity or urgency, businesses can automatically flag tweets, reviews, videos, tickets, and the like, and take action sooner rather than later.

AI Text Analysis Delivers Consistent Criteria

Humans make errors. Fact. And the more tedious and time-consuming a task is, the more errors they make. By training text analysis models to your needs and criteria, algorithms can analyze and sort through data while applying the same criteria to every text, which makes the results more consistent than manual tagging.

Text Analysis Methods & Techniques


There are basic and more advanced text analysis techniques, each used for different purposes. First, learn about the simpler text analysis techniques and examples of when you might use each one.

Text Classification

Text classification is the process of assigning predefined tags or categories to unstructured text. It's considered one of the most useful natural language processing techniques because it's so versatile and can organize, structure, and categorize pretty much any form of text to deliver meaningful data and solve problems. Natural language processing (NLP) is a field of artificial intelligence that enables computers to break down and interpret text much as a human would.

Below, we're going to focus on some of the most common text classification tasks, which include sentiment analysis, topic modeling, language detection, and intent detection.

Sentiment Analysis

Customers freely leave their opinions about businesses and products in customer service interactions, on surveys, and all over the internet. Sentiment analysis uses machine learning algorithms to automatically read text and classify it by opinion polarity (positive, negative, neutral). More advanced models go further and detect the feelings and emotions of the writer, and even context and sarcasm.

For example, by using sentiment analysis companies are able to flag complaints or urgent requests, so they can be dealt with immediately, and even avert a PR crisis on social media. Sentiment classifiers can assess brand reputation, carry out market research, and help improve products with customer feedback.
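
To make polarity classification concrete, here is a deliberately simple Python sketch. It is a lexicon-based stand-in, not the machine learning classifiers described above: it counts words from two tiny word lists and compares the totals. The `POSITIVE` and `NEGATIVE` sets are illustrative assumptions, not a real sentiment resource.

```python
import re

# Tiny illustrative lexicons (assumptions, not a real sentiment resource).
POSITIVE = {"great", "love", "easy", "helpful", "fast"}
NEGATIVE = {"slow", "broken", "terrible", "rude", "late"}

def classify_sentiment(text):
    """Label text positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify_sentiment("The support team was helpful and fast"))         # positive
print(classify_sentiment("My order arrived late and the box was broken"))  # negative
```

A word-counting approach like this cannot handle negation ("not helpful") or sarcasm, which is one reason trained models are preferred in practice.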

Try out MonkeyLearn's pre-trained classifier with your own text to see how it works.

Topic Analysis

Another common example of text classification is topic analysis (or topic modeling) that automatically organizes text by subject or theme. For example:

“The app is really simple and easy to use”

If we are using topic categories, like Pricing, Customer Support, and Ease of Use, this product feedback would be classified under Ease of Use.
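
The classification above can be sketched in Python with simple keyword matching. This is a rule-based illustration rather than a trained topic model, and the keyword lists in `TOPIC_KEYWORDS` are hypothetical, invented here for the three categories named in the example.

```python
import re

# Hypothetical keyword lists for the three example categories.
TOPIC_KEYWORDS = {
    "Pricing": {"price", "expensive", "cheap", "cost"},
    "Customer Support": {"support", "agent", "helpdesk", "response"},
    "Ease of Use": {"simple", "easy", "intuitive", "confusing"},
}

def classify_topic(text):
    """Return the topic whose keywords appear most often, or None if none match."""
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {
        topic: sum(token in words for token in tokens)
        for topic, words in TOPIC_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(classify_topic("The app is really simple and easy to use"))  # Ease of Use
```

A trained classifier learns these associations from labeled examples instead of relying on a hand-written list, so it can generalize to wording the list does not cover.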

Try out MonkeyLearn's pre-trained topic classifier, which can be used to categorize NPS responses for SaaS products.

Intent Detection

Text classifiers can also be used to detect the intent of a text. Intent detection or intent classification is often used to automatically understand the reason behind customer feedback. Is it a complaint? Or is a customer writing with the intent to purchase a product? Machine learning can read chatbot conversations or emails and automatically route them to the proper department or employee.

Try out MonkeyLearn's email intent classifier .

Text extraction is another widely used text analysis technique that extracts pieces of data that already exist within any given text. You can extract things like keywords, prices, company names, and product specifications from news reports, product reviews, and more.

You can automatically populate spreadsheets with this data or perform extraction in concert with other text analysis techniques to categorize and extract data at the same time.

Keyword Extraction

Keywords are the most used and most relevant terms within a text: the words and phrases that summarize its contents. Keyword extraction can be used to index data to be searched and to generate word clouds (a visual representation of text data).

Try out MonkeyLearn's pre-trained keyword extractor to see how it works.

Entity Recognition

A named entity recognition (NER) extractor finds entities, such as people, companies, or locations, that exist within text data. Results are shown labeled with the corresponding entity type, as in MonkeyLearn's pre-trained name extractor.

Word frequency is a text analysis technique that measures the most frequently occurring words or concepts in a given text using the numerical statistic TF-IDF (term frequency-inverse document frequency).

You might apply this technique to analyze the words or expressions customers use most frequently in support conversations. For example, if the word 'delivery' appears most often in a set of negative support tickets, this might suggest customers are unhappy with your delivery service.
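As a rough illustration of the idea, the sketch below counts raw word frequencies and computes a basic TF-IDF score using only the Python standard library. The tickets are made up for the example, and `tokenize` and `tf_idf` are hypothetical helper names, not part of any particular tool.

```python
import math
import re
from collections import Counter

# Hypothetical support tickets, invented for illustration.
tickets = [
    "My delivery is late again",
    "The delivery arrived damaged",
    "Late delivery and no reply from support",
]

def tokenize(text):
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

docs = [tokenize(t) for t in tickets]

# Raw word frequency across the whole set of tickets.
word_counts = Counter(token for doc in docs for token in doc)

def tf_idf(term, doc, all_docs):
    """Term frequency in one document times the log inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in all_docs if term in d)
    return tf * math.log(len(all_docs) / df) if df else 0.0

print(word_counts.most_common(3))
print(tf_idf("damaged", docs[1], docs))
```

Note the difference between the two measures: 'delivery' has the highest raw count, but because it appears in every ticket its TF-IDF score is zero, while a word like 'damaged' that is specific to one ticket scores higher.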

Collocation helps identify words that commonly co-occur. For example, in customer reviews on a hotel booking website, the words 'air' and 'conditioning' are more likely to co-occur rather than appear individually. Bigrams (two adjacent words e.g. 'air conditioning' or 'customer support') and trigrams (three adjacent words e.g. 'out of office' or 'to be continued') are the most common types of collocation you'll need to look out for.

Collocation can be helpful to identify hidden semantic structures and improve the granularity of the insights by counting bigrams and trigrams as one word.
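Counting bigrams can be sketched in a few lines of Python. The reviews below are invented for illustration, and `ngrams` is a hypothetical helper; a real analysis would also filter by statistical association rather than raw counts alone.

```python
import re
from collections import Counter

# Hypothetical hotel reviews, invented for illustration.
reviews = [
    "The air conditioning was broken",
    "Great customer support and quiet air conditioning",
    "Customer support fixed the air conditioning quickly",
]

def ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))

bigram_counts = Counter()
for review in reviews:
    tokens = re.findall(r"[a-z]+", review.lower())
    bigram_counts.update(ngrams(tokens, 2))

print(bigram_counts.most_common(3))
```

Passing `n=3` to the same helper yields trigrams.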

Concordance helps identify the context and instances of words or a set of words. For example, the following is the concordance of the word “simple” in a set of app reviews:

[Figure: Concordance Example]

In this case, the concordance of the word “simple” can give us a quick grasp of how reviewers are using this word. It can also be used to decode the ambiguity of the human language to a certain extent, by looking at how words are used in different contexts, as well as being able to analyze more complex phrases.
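A concordance can be approximated with a simple keyword-in-context listing. In this sketch, `concordance` is a hypothetical helper, the `width` parameter (number of context words on each side) is an assumption, and the second review is invented for illustration.

```python
import re

def concordance(texts, keyword, width=2):
    """Return (left context, keyword, right context) for each occurrence."""
    rows = []
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                rows.append((left, token, right))
    return rows

app_reviews = [
    "The app is really simple and easy to use",
    "Simple design but too many ads",  # hypothetical review
]

for left, word, right in concordance(app_reviews, "simple"):
    print(f"{left:>20} [{word}] {right}")
```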

It's very common for a word to have more than one meaning, which is why word sense disambiguation is a major challenge of natural language processing. Take the word 'light' for example. Is the text referring to weight, color, or an electrical appliance? Smart text analysis with word sense disambiguation can differentiate words that have more than one meaning, but only after training models to do so.

Text clustering can group vast quantities of unstructured data. Although less accurate than classification algorithms, clustering algorithms are faster to implement, because you don't need to tag examples to train models. That means these algorithms mine information and find groupings without the use of tagged training data, an approach otherwise known as unsupervised machine learning.

Google is a great example of how clustering works. When you search for a term on Google, have you ever wondered how it takes just seconds to pull up relevant results? Google's algorithm breaks down unstructured data from web pages and groups pages into clusters around a set of similar words or n-grams (all possible combinations of adjacent words or letters in a text). So, the pages from the cluster that contain a higher count of words or n-grams relevant to the search query will appear first within the results.

How does Text Analysis work?

To really understand how automated text analysis works, you need to understand the basics of machine learning. Let's start with this definition from Machine Learning by Tom Mitchell:

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

In other words, if we want text analysis software to perform desired tasks, we need to teach machine learning algorithms how to analyze, understand and derive meaning from text. But how? The simple answer is by tagging examples of text. Once a machine has enough examples of tagged text to work with, algorithms are able to start differentiating and making associations between pieces of text, and make predictions by themselves.

It's very similar to the way humans learn how to differentiate between topics, objects, and emotions. Let's say we have urgent and low priority issues to deal with. We don't instinctively know the difference between them – we learn gradually by associating urgency with certain expressions.

For example, when we want to identify urgent issues, we'd look out for expressions like 'please help me ASAP!' or 'urgent: can't enter the platform, the system is DOWN!!' . On the other hand, to identify low priority issues, we'd search for more positive expressions like 'thanks for the help! Really appreciate it' or 'the new feature works like a dream' .

Text analysis can stretch its AI wings across a range of texts depending on the results you desire. It can be applied to:

  • Whole documents : obtains information from a complete document or paragraph: e.g., the overall sentiment of a customer review.
  • Single sentences : obtains information from specific sentences: e.g., more detailed sentiments of every sentence of a customer review.
  • Sub-sentences : obtains information from sub-expressions within a sentence: e.g., the underlying sentiments of every opinion unit of a customer review.

Once you know how you want to break up your data, you can start analyzing it.

Let’s take a look at how text analysis works, step-by-step, and go into more detail about the different machine learning algorithms and techniques available.

1. Data Gathering

You can gather data about your brand, product or service from both internal and external sources:

Internal Data

This is the data you generate every day, from emails and chats, to surveys, customer queries, and customer support tickets.

You just need to export it from your software or platform as a CSV or Excel file, or connect an API to retrieve it directly.

Some examples of internal data:

Customer Service Software : the software you use to communicate with customers, manage user queries and deal with customer support issues: Zendesk, Freshdesk, and Help Scout are a few examples.

CRM : software that keeps track of all the interactions with clients or potential clients. It can involve different areas, from customer support to sales and marketing. Hubspot, Salesforce, and Pipedrive are examples of CRMs.

Chat : apps that communicate with the members of your team or your customers, like Slack, Hipchat, Intercom, and Drift.

Email : the king of business communication, emails are still the most popular tool to manage conversations with customers and team members.

Surveys : generally used to gather customer service feedback, product feedback, or to conduct market research, like Typeform, Google Forms, and SurveyMonkey.

NPS (Net Promoter Score) : one of the most popular metrics for customer experience in the world. Many companies use NPS tracking software to collect and analyze feedback from their customers. A few examples are Delighted and Satismeter.

Databases : a database is a collection of information. By using a database management system, a company can store, manage and analyze all sorts of data. Examples of databases include Postgres, MongoDB, and MySQL.

Product Analytics : the feedback and information about interactions of a customer with your product or service. It's useful to understand the customer's journey and make data-driven decisions. ProductBoard and UserVoice are two tools you can use to process product analytics.

External Data

This is text data about your brand or products from all over the web. You can use web scraping tools, APIs, and open datasets to collect external data from social media, news reports, online reviews, forums, and more, and analyze it with machine learning models.

Web Scraping Tools:

Visual Web Scraping Tools : you can build your own web scraper even with no coding experience, with tools like Portia and ParseHub.

Web Scraping Frameworks : seasoned coders can benefit from tools, like Scrapy in Python and Wombat in Ruby, to create custom scrapers.

Facebook, Twitter, and Instagram, for example, have their own APIs and allow you to extract data from their platforms. Major media outlets like the New York Times or The Guardian also have their own APIs and you can use them to search their archive or gather users' comments, among other things.


SaaS tools like MonkeyLearn offer integrations with the tools you already use. You can connect directly to Twitter, Google Sheets, Gmail, Zendesk, SurveyMonkey, Rapidminer, and more, and perform text analysis on Excel data by uploading a file.

2. Data Preparation

In order to automatically analyze text with machine learning, you’ll need to organize your data. Most of this is done automatically, and you won't even notice it's happening. However, it's important to understand that automatic text analysis makes use of a number of natural language processing (NLP) techniques, such as the ones below.

Tokenization, Part-of-speech Tagging, and Parsing

Tokenization is the process of breaking up a string of characters into semantically meaningful parts that can be analyzed (e.g., words), while discarding meaningless chunks (e.g. whitespaces).

The examples below show two different ways in which one could tokenize the string 'Analyzing text is not that hard' .

(Incorrect): Analyzing text is not that hard. = [“Analyz”, “ing text”, “is n”, “ot that”, “hard.”]

(Correct): Analyzing text is not that hard. = [“Analyzing”, “text”, “is”, “not”, “that”, “hard”, “.”]
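The correct tokenization above can be reproduced with a short regular expression. This is a minimal sketch using only the standard library; the pattern is an assumption that works for simple English sentences, and real tokenizers handle many more cases (contractions, URLs, numbers with separators, and so on).

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens, dropping whitespace."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Analyzing text is not that hard.")
print(tokens)
# ['Analyzing', 'text', 'is', 'not', 'that', 'hard', '.']
```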

Once the tokens have been recognized, it's time to categorize them. Part-of-speech tagging refers to the process of assigning a grammatical category, such as noun, verb, etc. to the tokens that have been detected.

Here are the PoS tags of the tokens from the sentence above:

“Analyzing”: VERB, “text”: NOUN, “is”: VERB, “not”: ADV, “that”: ADV, “hard”: ADJ, “.”: PUNCT

With all the categorized tokens and a language model (i.e. a grammar), the system can now create more complex representations of the texts it will analyze. This process is known as parsing . In other words, parsing refers to the process of determining the syntactic structure of a text. To do this, the parsing algorithm makes use of a grammar of the language the text has been written in. Different representations will result from the parsing of the same text with different grammars.

The examples below show the dependency and constituency representations of the sentence 'Analyzing text is not that hard' .

Dependency Parsing

Dependency grammars can be defined as grammars that establish directed relations between the words of sentences. Dependency parsing is the process of using a dependency grammar to determine the syntactic structure of a sentence:

[Figure: Dependency Parsing]

Constituency Parsing

Constituency phrase structure grammars model syntactic structures by making use of abstract nodes associated to words and other abstract categories (depending on the type of grammar) and undirected relations between them. Constituency parsing refers to the process of using a constituency grammar to determine the syntactic structure of a sentence:

[Figure: Constituency Parsing]

As you can see in the images above, the output of the parsing algorithms contains a great deal of information which can help you understand the syntactic (and some of the semantic) complexity of the text you intend to analyze.

Depending on the problem at hand, you might want to try different parsing strategies and techniques. However, at present, dependency parsing seems to outperform other approaches.

Lemmatization and Stemming

Stemming and lemmatization both refer to the process of removing all of the affixes (i.e. suffixes, prefixes, etc.) attached to a word in order to keep its lexical base, also known as its root or stem, or its dictionary form, also known as its lemma. The main difference between these two processes is that stemming is usually based on rules that trim word beginnings and endings (and sometimes lead to somewhat weird results), whereas lemmatization makes use of dictionaries and a much more complex morphological analysis.

The table below shows the output of NLTK's Snowball Stemmer and spaCy's lemmatizer for the tokens in the sentence 'Analyzing text is not that hard'. The differences in the output have been boldfaced:

[Table: NLTK's Snowball Stemmer and spaCy's lemmatizer]
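To make the contrast concrete, here is a toy sketch of both approaches. It is not NLTK's Snowball Stemmer or spaCy's lemmatizer: `SUFFIXES` is a tiny made-up rule set and `LEMMAS` is a hypothetical two-entry dictionary, chosen only to show how rule-based trimming and dictionary lookup can disagree.

```python
# Toy rule set, far smaller than a real stemmer's (an assumption for illustration).
SUFFIXES = ("ing", "ed", "es", "s")

# Hypothetical mini-dictionary mapping inflected forms to lemmas.
LEMMAS = {"analyzing": "analyze", "is": "be"}

def toy_stem(word):
    """Trim the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def toy_lemmatize(word):
    """Look the word up in the dictionary, falling back to the word itself."""
    word = word.lower()
    return LEMMAS.get(word, word)

tokens = ["Analyzing", "text", "is", "not", "that", "hard"]
for token in tokens:
    print(token, toy_stem(token), toy_lemmatize(token))
```

The rule-based stem of 'Analyzing' is the non-word 'analyz', while the dictionary lookup returns 'analyze', and only the dictionary knows that 'is' is a form of 'be'.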

Stopword Removal

To provide a more accurate automated analysis of the text, we need to remove the words that provide very little semantic information or no meaning at all. These words are also known as stopwords: a, and, or, the, etc.

There are many different lists of stopwords for every language. However, it's important to understand that you might need to add words to or remove words from those lists depending on the texts you want to analyze and the analyses you would like to perform.

You might want to do some kind of lexical analysis of the domain your texts come from in order to determine the words that should be added to the stopwords list.
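A stopword filter with a customizable list might look like the sketch below. `BASE_STOPWORDS` is a deliberately tiny, made-up list, and the `extra` and `keep` parameters are hypothetical names for the domain-specific additions and removals described above.

```python
# A deliberately tiny stopword list for illustration; real lists are much longer.
BASE_STOPWORDS = {"a", "an", "and", "or", "the", "is", "to", "of"}

def remove_stopwords(tokens, extra=(), keep=()):
    """Drop stopwords, after adding domain-specific ones and protecting others."""
    stopwords = (BASE_STOPWORDS | set(extra)) - set(keep)
    return [t for t in tokens if t.lower() not in stopwords]

tokens = ["The", "app", "is", "simple", "and", "easy", "to", "use"]

print(remove_stopwords(tokens))
print(remove_stopwords(tokens, extra={"app"}))  # 'app' carries no signal in app reviews
print(remove_stopwords(tokens, keep={"and"}))   # keep a word the base list would drop
```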

3. Analyze Your Text Data

Now that you’ve learned how to mine unstructured text data and the basics of data preparation, how do you analyze all of this text?

Well, the analysis of unstructured text is not straightforward. There are countless text analysis methods, but two of the main techniques are text classification and text extraction.

Text classification (also known as text categorization or text tagging) refers to the process of assigning tags to texts based on their content.

In the past, text classification was done manually, which was time-consuming, inefficient, and inaccurate. Automated machine learning text analysis models, by contrast, often work in just seconds and with high accuracy.

The most popular text classification tasks include sentiment analysis (i.e. detecting when a text says something positive or negative about a given topic), topic detection (i.e. determining what topics a text talks about), and intent detection (i.e. detecting the purpose or underlying intent of the text), among others, but there are a great many more applications you might be interested in.

Rule-based Systems

In text classification, a rule is essentially a human-made association between a linguistic pattern that can be found in a text and a tag. Rules usually consist of references to morphological, lexical, or syntactic patterns, but they can also contain references to other components of language, such as semantics or phonology.

Here's an example of a simple rule for classifying product descriptions according to the type of product described in the text:

(HDD|RAM|SSD|Memory) → Hardware

In this case, the system will assign the Hardware tag to those texts that contain the words HDD , RAM , SSD , or Memory .
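The rule above could be implemented roughly as follows. The word boundaries and case-insensitive matching are assumptions added so that 'RAM' does not match inside a word like 'program'; `RULES` and `classify` are hypothetical names.

```python
import re

# Each rule pairs a pattern with the tag it assigns.
RULES = [
    (re.compile(r"\b(HDD|RAM|SSD|Memory)\b", re.IGNORECASE), "Hardware"),
]

def classify(text):
    """Return the tags of every rule whose pattern is found in the text."""
    return [tag for pattern, tag in RULES if pattern.search(text)]

print(classify("Laptop with 16GB RAM and 512GB SSD"))
print(classify("Cotton t-shirt, size M"))
```

Adding a new product type means adding another `(pattern, tag)` pair to `RULES`, which also hints at why large rule sets become hard to maintain.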

The most obvious advantage of rule-based systems is that they are easily understandable by humans. However, creating complex rule-based systems takes a lot of time and a good deal of knowledge of both linguistics and the topics being dealt with in the texts the system is supposed to analyze.

On top of that, rule-based systems are difficult to scale and maintain because adding new rules or modifying the existing ones requires a lot of analysis and testing of the impact of these changes on the results of the predictions.

Machine Learning-based Systems

Machine learning-based systems can make predictions based on what they learn from past observations. These systems need to be fed multiple examples of texts and the expected predictions (tags) for each. This is called training data. The more consistent and accurate your training data, the better the final predictions will be.

When you train a machine learning-based classifier, training data has to be transformed into something a machine can understand, that is, vectors (i.e. lists of numbers which encode information). By using vectors, the system can extract relevant features (pieces of information) which will help it learn from the existing data and make predictions about the texts to come.

There are a number of ways to do this, but one of the most frequently used is called bag of words vectorization.
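In its simplest form, bag of words vectorization builds a vocabulary from the training texts and represents each text as a list of word counts in vocabulary order. The sketch below assumes made-up texts and a basic regex tokenizer; `vectorize` is a hypothetical helper name.

```python
import re

# Hypothetical training texts, invented for illustration.
texts = [
    "The app is simple",
    "The app is slow and the support is slow",
]

def tokenize(text):
    """Lowercase the text and keep runs of letters."""
    return re.findall(r"[a-z]+", text.lower())

# The vocabulary fixes the meaning of each position in the vector.
vocabulary = sorted({token for text in texts for token in tokenize(text)})

def vectorize(text):
    """Count how often each vocabulary word occurs; unknown words are ignored."""
    tokens = tokenize(text)
    return [tokens.count(term) for term in vocabulary]

print(vocabulary)
print(vectorize(texts[0]))
print(vectorize(texts[1]))
```

Word order is discarded, which is where the "bag" in the name comes from.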

Once the texts have been transformed into vectors, they are fed into a machine learning algorithm together with their expected output to create a classification model that can choose what features best represent the texts and make predictions about unseen texts:

[Figure: Creating the Classification Model]

The trained model will transform unseen text into a vector, extract its relevant features, and make a prediction:

[Figure: Predicting data with the Classification Model]

Machine Learning Algorithms

There are many machine learning algorithms used in text classification. The most frequently used are the Naive Bayes (NB) family of algorithms, Support Vector Machines (SVM), and deep learning algorithms.

The Naive Bayes family of algorithms is based on Bayes's Theorem and the conditional probabilities of occurrence of the words of a sample text within the words of a set of texts that belong to a given tag. Vectors that represent texts encode information about how likely it is for the words in the text to occur in the texts of a given tag. With this information, the probability of a text's belonging to any given tag in the model can be computed. Once all of the probabilities have been computed for an input text, the classification model will return the tag with the highest probability as the output for that input.

One of the main advantages of this algorithm is that results can be quite good even if there’s not much training data.
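The sketch below is a minimal multinomial Naive Bayes classifier written with the standard library only. Add-one (Laplace) smoothing and the choice to skip words never seen in training are assumptions of this sketch, and the four training examples are adapted from the urgent and low priority expressions mentioned earlier.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Count tags, per-tag word occurrences, and the overall vocabulary."""
    tag_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, tag in examples:
        tag_counts[tag] += 1
        for token in tokenize(text):
            word_counts[tag][token] += 1
            vocabulary.add(token)
    return tag_counts, word_counts, vocabulary

def predict(text, model):
    """Return the tag with the highest log-probability for the text."""
    tag_counts, word_counts, vocabulary = model
    total_examples = sum(tag_counts.values())
    scores = {}
    for tag in tag_counts:
        # log P(tag) + sum of log P(word | tag), with add-one smoothing.
        score = math.log(tag_counts[tag] / total_examples)
        total_words = sum(word_counts[tag].values())
        for token in tokenize(text):
            if token in vocabulary:  # words never seen in training are skipped
                count = word_counts[tag][token]
                score += math.log((count + 1) / (total_words + len(vocabulary)))
        scores[tag] = score
    return max(scores, key=scores.get)

training_data = [
    ("please help me asap", "Urgent"),
    ("urgent the system is down", "Urgent"),
    ("thanks for the help", "Low priority"),
    ("the new feature works like a dream", "Low priority"),
]
model = train(training_data)

print(predict("the system is down please help", model))
print(predict("thanks, works like a dream", model))
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on longer texts.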

Support Vector Machines (SVM) is an algorithm that can divide a vector space of tagged texts into two subspaces: one space that contains most of the vectors that belong to a given tag and another subspace that contains most of the vectors that do not belong to that one tag.

Classification models that use SVM at their core will transform texts into vectors and will determine what side of the boundary that divides the vector space for a given tag those vectors belong to. Based on where they land, the model will know if they belong to a given tag or not.

The most important advantage of using SVM is that results are usually better than those obtained with Naive Bayes. However, more computational resources are needed for SVM.

Deep Learning is a set of algorithms and techniques that use “artificial neural networks” to process data in a way loosely inspired by the human brain. These algorithms use huge amounts of training data (millions of examples) to generate semantically rich representations of texts, which can then be fed into machine learning-based models of different kinds. With enough training data, these models can make much more accurate predictions than traditional machine learning models:

[Figure: Deep Learning vs Traditional Machine Learning algorithms]

Hybrid Systems

Hybrid systems usually contain machine learning-based systems at their cores and rule-based systems to improve the predictions.

Classifier performance is usually evaluated through standard metrics used in the machine learning field: accuracy, precision, recall, and F1 score. Understanding what they mean will give you a clearer idea of how good your classifiers are at analyzing your texts.

It is also important to understand that evaluation can be performed over a fixed testing set (i.e. a set of texts for which we know the expected output tags) or by using cross-validation (i.e. a method that splits your training data into different folds so that you can use some subsets of your data for training purposes and some for testing purposes, see below ).

Accuracy, Precision, Recall, and F1 score

Accuracy is the number of correct predictions the classifier has made divided by the total number of predictions. In general, accuracy alone is not a good indicator of performance. For example, when categories are imbalanced, that is, when there is one category that contains many more examples than all of the others, predicting all texts as belonging to that category will return high accuracy levels. This is known as the accuracy paradox . To get a better idea of the performance of a classifier, you might want to consider precision and recall instead.

Precision states how many texts were predicted correctly out of the ones that were predicted as belonging to a given tag. In other words, precision takes the number of texts that were correctly predicted as positive for a given tag and divides it by the number of texts that were predicted (correctly and incorrectly) as belonging to the tag.

We have to bear in mind that precision only gives information about the cases where the classifier predicts that the text belongs to a given tag. This might be particularly important, for example, if you would like to generate automated responses for user messages. In this case, before you send an automated response you want to know for sure you will be sending the right response, right? In other words, if your classifier says the user message belongs to a certain type of message, you would like the classifier to make the right guess. This means you would like a high precision for that type of message.

Recall states how many texts were predicted correctly out of the ones that should have been predicted as belonging to a given tag. In other words, recall takes the number of texts that were correctly predicted as positive for a given tag and divides it by the number of texts that were either predicted correctly as belonging to the tag or that were incorrectly predicted as not belonging to the tag.

Recall might prove useful when routing support tickets to the appropriate team, for example. It might be desired for an automated system to detect as many tickets as possible for a critical tag (for example tickets about 'Outages / Downtime') at the expense of making some incorrect predictions along the way. In this case, making a prediction will help perform the initial routing and solve most of these critical issues ASAP. If the prediction is incorrect, the ticket will get rerouted by a member of the team. When processing thousands of tickets per week, high recall (with good levels of precision as well, of course) can save support teams a good deal of time and enable them to solve critical issues faster.

The F1 score is the harmonic mean of precision and recall. It tells you how well your classifier performs if equal importance is given to precision and recall. In general, the F1 score is a much better indicator of classifier performance than accuracy is.
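The four metrics can be computed directly from the expected and predicted tags. In this sketch, `evaluate` is a hypothetical helper and the ticket data is invented; the first example reproduces the accuracy paradox on an imbalanced set.

```python
def evaluate(expected, predicted, tag):
    """Compute accuracy overall, and precision, recall, and F1 for one tag."""
    pairs = list(zip(expected, predicted))
    tp = sum(1 for e, p in pairs if e == tag and p == tag)  # true positives
    fp = sum(1 for e, p in pairs if e != tag and p == tag)  # false positives
    fn = sum(1 for e, p in pairs if e == tag and p != tag)  # false negatives
    accuracy = sum(1 for e, p in pairs if e == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Imbalanced toy data: 9 'Other' tickets and 1 'Urgent' ticket.
expected = ["Other"] * 9 + ["Urgent"]
predicted = ["Other"] * 10  # a classifier that always predicts the majority tag

paradox = evaluate(expected, predicted, "Urgent")
print(paradox)

balanced = evaluate(
    ["Urgent", "Urgent", "Other", "Other"],
    ["Urgent", "Other", "Urgent", "Other"],
    "Urgent",
)
print(balanced)
```

In the first case accuracy is 90% even though the classifier never finds the single urgent ticket, so precision, recall, and F1 for 'Urgent' are all zero.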


Cross-validation is quite frequently used to evaluate the performance of text classifiers. The method is simple. First of all, the training dataset is randomly split into a number of equal-length subsets (e.g. 4 subsets with 25% of the original data each). Then, all the subsets except for one are used to train a classifier (in this case, 3 subsets with 75% of the original data) and this classifier is used to predict the texts in the remaining subset. Next, all the performance metrics are computed (i.e. accuracy, precision, recall, F1, etc.). Finally, the process is repeated with a new testing fold until all the folds have been used for testing purposes.

Once all folds have been used, the average performance metrics are computed and the evaluation process is finished.
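The procedure can be sketched as follows. All names here are hypothetical: `k_fold_splits` produces the train/test pairs, `cross_validate` averages a score over the folds, and a trivial majority-tag "classifier" stands in for a real model so the example stays self-contained.

```python
import random
from collections import Counter

def k_fold_splits(examples, k, seed=0):
    """Yield (training set, testing fold) pairs for k-fold cross-validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield train, test_fold

def cross_validate(examples, k, train_fn, score_fn):
    """Train on k-1 folds, score on the remaining one, and average the scores."""
    scores = [
        score_fn(train_fn(train), test)
        for train, test in k_fold_splits(examples, k)
    ]
    return sum(scores) / len(scores)

# Stand-in model: always predicts the most common tag seen in training.
def train_majority(train):
    return Counter(tag for _, tag in train).most_common(1)[0][0]

def accuracy(model, test):
    return sum(1 for _, tag in test if tag == model) / len(test)

# Hypothetical tagged data: six texts tagged 'A' and two tagged 'B'.
data = [(f"text {i}", "A") for i in range(6)] + [(f"text {i}", "B") for i in range(6, 8)]

mean_accuracy = cross_validate(data, 4, train_majority, accuracy)
print(mean_accuracy)
```

With 4 folds, each classifier is trained on 75% of the data and tested on the remaining 25%, matching the example in the text.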

Text Extraction refers to the process of recognizing structured pieces of information from unstructured text. For example, it can be useful to automatically detect the most relevant keywords from a piece of text, identify names of companies in a news article, detect lessors and lessees in a financial contract, or identify prices on product descriptions.

Regular Expressions

Regular Expressions (a.k.a. regexes) work as the equivalent of the rules defined in classification tasks. In this case, a regular expression defines a pattern of characters that will be associated with a tag.

For example, the pattern below will detect most email addresses in a text if they are preceded and followed by spaces:

(?i)\b(?:[a-zA-Z0-9_\-\.]+)@(?:(?:\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(?:(?:[a-zA-Z0-9\-]+\.)+))(?:[a-zA-Z]{2,4}|[0-9]{1,3})(?:\]?)\b

By detecting this match in texts and assigning it the email tag, we can create a rudimentary email address extractor.
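A rudimentary extractor along these lines might look like the sketch below. It uses a deliberately simplified email pattern (an assumption, not the pattern shown above), and `extract_emails` is a hypothetical helper name.

```python
import re

# Simplified pattern for illustration; it will miss some valid addresses.
EMAIL_PATTERN = re.compile(r"\b[\w.-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}\b")

def extract_emails(text):
    """Return a (tag, match) pair for every email address found in the text."""
    return [("email", match.group()) for match in EMAIL_PATTERN.finditer(text)]

print(extract_emails("Write to support@example.com or sales@example.org."))
```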

There are obvious pros and cons of this approach. On the plus side, you can create text extractors quickly and the results obtained can be good, provided you can find the right patterns for the type of information you would like to detect. On the minus side, regular expressions can get extremely complex and might be really difficult to maintain and scale, particularly when many expressions are needed in order to extract the desired patterns.

Conditional Random Fields

Conditional Random Fields (CRF) is a statistical approach often used in machine-learning-based text extraction. This approach learns the patterns to be extracted by weighing a set of features of the sequences of words that appear in a text. Through the use of CRFs, we can add multiple variables which depend on each other to the patterns we use to detect information in texts, such as syntactic or semantic information.

This usually generates much richer and more complex patterns than using regular expressions and can potentially encode much more information. However, more computational resources are needed in order to implement it since all the features have to be calculated for all the sequences to be considered and all of the weights assigned to those features have to be learned before determining whether a sequence should belong to a tag or not.

One of the main advantages of the CRF approach is its generalization capacity. Once an extractor has been trained using the CRF approach over texts of a specific domain, it will have the ability to generalize what it has learned to other domains reasonably well.

Extractors are sometimes evaluated by calculating the same standard performance metrics we have explained above for text classification, namely, accuracy, precision, recall, and F1 score. However, these metrics do not account for partial matches of patterns. In order for an extracted segment to be a true positive for a tag, it has to be a perfect match with the segment that was supposed to be extracted.

Consider the following example:

'Your flight will depart on January 14, 2020 at 03:30 PM from SFO'

If we created a date extractor, we would expect it to return January 14, 2020 as a date from the text above, right? So, if the output of the extractor were January 14, 2020, we would count it as a true positive for the tag DATE .

But, what if the output of the extractor were January 14? Would you say the extraction was bad? Would you say it was a false positive for the tag DATE ? To capture partial matches like this one, some other performance metrics can be used to evaluate the performance of extractors. One example of this is the ROUGE family of metrics.

ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a family of metrics used in the fields of machine translation and automatic summarization that can also be used to assess the performance of text extractors. These metrics basically compute the lengths and number of sequences that overlap between a reference text (in this case, the segment we expected to extract) and the translated or summarized text (in this case, our extraction).

Depending on the length of the units whose overlap you would like to compare, you can define ROUGE-n metrics (for units of length n) or you can define the ROUGE-LCS or ROUGE-L metric if you intend to compare the longest common subsequence (LCS).
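As a rough sketch of how ROUGE-n gives credit for partial matches, the code below computes ROUGE-n recall between the expected date and the extractor's output from the example above. The word-level regex tokenizer is an assumption, and `rouge_n_recall` is a hypothetical helper rather than a reference implementation.

```python
import re
from collections import Counter

def ngram_counts(text, n):
    """Count the word n-grams in a text, ignoring case and punctuation."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(zip(*(tokens[i:] for i in range(n))))

def rouge_n_recall(reference, candidate, n=1):
    """Share of the reference's n-grams that also appear in the candidate."""
    ref = ngram_counts(reference, n)
    cand = ngram_counts(candidate, n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    total = sum(ref.values())
    return overlap / total if total else 0.0

expected_date = "January 14, 2020"

print(rouge_n_recall(expected_date, "January 14, 2020", n=1))
print(rouge_n_recall(expected_date, "January 14", n=1))
print(rouge_n_recall(expected_date, "January 14", n=2))
```

The exact match scores 1.0, while the partial extraction 'January 14' recovers two of the three expected words and one of the two expected bigrams, instead of simply counting as a miss.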

4. Visualize Your Text Data

Now you know a variety of text analysis methods to break down your data, but what do you do with the results? Business intelligence (BI) and data visualization tools make it easy to understand your results in striking dashboards.

  • MonkeyLearn Studio

MonkeyLearn Studio is an all-in-one data gathering, analysis, and visualization tool. Its machine learning techniques allow you to choose the text analyses you need (keyword extraction, sentiment analysis, aspect classification, and more) and chain them together to work simultaneously.

You’ll see the importance of text analytics right away. Simply upload your data and visualize the results for powerful insights. It all works together in a single interface, so you no longer have to upload and download between applications.

  • Google Data Studio

Google's free visualization tool allows you to create interactive reports using a wide variety of data. Once you've imported your data you can use different tools to design your report and turn your data into an impressive visual story. Share the results with individuals or teams, publish them on the web, or embed them on your website.

  • Looker

Looker is a business data analytics platform designed to direct meaningful data to anyone within a company. The idea is to allow teams to have a bigger picture about what's happening in their company.

You can connect to different databases and automatically create data models, which can be fully customized to meet specific needs.

  • Tableau

Tableau is a business intelligence and data visualization tool with an intuitive, user-friendly approach (no technical skills required). Tableau allows organizations to work with almost any existing data source and provides powerful visualization options with more advanced tools for developers.

There's a trial version available for anyone wanting to give it a go. Learn how to perform text analysis in Tableau.

Text Analysis Applications & Examples

Did you know that 80% of business data is text? Text is present in every major business process, from support tickets, to product feedback, and online customer interactions. Automated, real time text analysis can help you get a handle on all that data with a broad range of business applications and use cases. Maximize efficiency and reduce repetitive tasks that often have a high turnover impact. Better understand customer insights without having to sort through millions of social media posts, online reviews, and survey responses.

If you work in customer experience, product, marketing, or sales, there are a number of text analysis applications to automate processes and get real world insights. And best of all you don’t need any data science or engineering experience to do it.

Social Media Monitoring

Let's say you work for Uber and you want to know what users are saying about the brand. You've read some positive and negative feedback on Twitter and Facebook. But 500 million tweets are sent each day, and Uber has thousands of mentions on social media every month. Can you imagine analyzing all of them manually?

This is where sentiment analysis comes in to analyze the opinion of a given text. By analyzing your social media mentions with a sentiment analysis model, you can automatically categorize them into Positive, Neutral, or Negative. Then run them through a topic analyzer to understand the subject of each text. By running aspect-based sentiment analysis, you can automatically pinpoint the reasons behind positive or negative mentions and get insights such as:

  • What is the top complaint about Uber on social media?
  • How successful is Uber's customer service: are people happy or annoyed with it?
  • What do Uber users like about the service when they mention Uber in a positive way?
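
To make the Positive / Neutral / Negative idea concrete, here is a minimal sketch that labels mentions by counting words from two tiny word lists. The word lists and sample mentions are made up for illustration, and a production sentiment model would be trained on labeled data rather than rely on a hand-written lexicon.

```python
import re

# Tiny illustrative word lists (assumptions, not a real sentiment lexicon).
POSITIVE_WORDS = {"great", "love", "fast", "friendly", "happy", "good"}
NEGATIVE_WORDS = {"late", "rude", "bad", "slow", "cancelled", "terrible"}


def classify_sentiment(text: str) -> str:
    """Label a mention Positive, Negative, or Neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


# Hypothetical social media mentions.
mentions = [
    "Love how fast my driver arrived, great service",
    "Driver was rude and the ride was late",
    "Took a ride to the airport today",
]
labels = [classify_sentiment(m) for m in mentions]
print(list(zip(labels, mentions)))
```

Word counting like this misses negation and sarcasm ("not great" still scores as positive), which is one reason trained models are preferred in practice.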

Now, let's say you've just added a new service to Uber. For example, Uber Eats. It's a crucial moment, and your company wants to know what people are saying about Uber Eats so that you can fix any glitches as soon as possible, and polish the best features. You can also use aspect-based sentiment analysis on your Facebook, Instagram and Twitter profiles for any Uber Eats mentions and discover things such as:

  • Are people happy with Uber Eats so far?
  • What is the most urgent issue to fix?
  • How can we incorporate positive stories into our marketing and PR communication?

Not only can you use text analysis to keep tabs on your brand's social media mentions, but you can also use it to monitor your competitors' mentions as well. Is a client complaining about a competitor's service? That gives you a chance to attract potential customers and show them how much better your brand is.

Brand Monitoring

Follow comments about your brand in real time wherever they may appear (social media, forums, blogs, review sites, etc.). You’ll know when something negative arises right away and be able to use positive comments to your advantage.

The power of negative reviews is quite strong: 40% of consumers are put off from buying if a business has negative reviews. An angry customer complaining about poor customer service can spread like wildfire within minutes: a friend shares it, then another, then another… And before you know it, the negative comments have gone viral.

  • Understand how your brand reputation evolves over time.
  • Compare your brand reputation to your competitor's.
  • Identify which aspects are damaging your reputation.
  • Pinpoint which elements are boosting your brand reputation on online media.
  • Identify potential PR crises so you can deal with them ASAP.
  • Tune into data from a specific moment, like the day of a new product launch or IPO filing. Just run a sentiment analysis on social media and press mentions on that day, to find out what people said about your brand.
  • Repost positive mentions of your brand to get the word out.

Customer Service

Despite many people's fears and expectations, text analysis doesn't mean that customer service will be entirely machine-powered. It just means that businesses can streamline processes so that teams can spend more time solving problems that require human interaction. That way businesses will be able to increase retention, given that 89 percent of customers change brands because of poor customer service. But, how can text analysis assist your company's customer service?

Ticket Tagging

Let machines do the work for you. Text analysis automatically identifies topics, and tags each ticket. Here's how it works:

  • The model analyzes the language and expressions a customer uses, for example, “I didn't get the right order.”
  • Then, it compares it to other similar conversations.
  • Finally, it finds a match and tags the ticket automatically. In this case, it could be under a Shipping Problems tag.

This happens automatically, whenever a new ticket comes in, freeing customer agents to focus on more important tasks.
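
The three steps above can be sketched roughly as follows: turn the ticket into a bag of words, compare it with previously tagged tickets, and copy the tag of the closest match. The example tickets and tag names are hypothetical, and cosine similarity over word counts is a simple stand-in for whatever a trained model would actually learn.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-labeled past tickets used as reference examples.
TAGGED_EXAMPLES = [
    ("I received the wrong order", "Shipping Problems"),
    ("My package never arrived", "Shipping Problems"),
    ("I was charged twice for my subscription", "Billing"),
    ("How do I reset my password", "Account"),
]


def bag_of_words(text: str) -> Counter:
    """Count lowercase word occurrences in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def tag_ticket(text: str) -> str:
    """Return the tag of the most similar labeled example."""
    vector = bag_of_words(text)
    _, best_tag = max(
        TAGGED_EXAMPLES, key=lambda ex: cosine_similarity(vector, bag_of_words(ex[0]))
    )
    return best_tag


print(tag_ticket("I didn't get the right order"))
```

With only four reference tickets the matching is crude. More tagged examples, or a proper classifier, would make the tags far more reliable.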

Ticket Routing & Triage: Find the Right Person for the Job

Machine learning can read a ticket for subject or urgency, and automatically route it to the appropriate department or employee.

For example, for a SaaS company that receives a customer ticket asking for a refund, the text mining system will identify which team usually handles billing issues and send the ticket to them. If a ticket says something like “How can I integrate your API with Python?”, it would go straight to the team in charge of helping with Integrations.

Ticket Analytics: Learn More From Your Customers

What is commonly assessed to determine the performance of a customer service team? Common KPIs are first response time, average time to resolution (i.e. how long it takes your team to resolve issues), and customer satisfaction (CSAT). And, let's face it, overall client satisfaction has a lot to do with the first two metrics.

But how do we get actual CSAT insights from customer conversations? How can we identify if a customer is happy with the way an issue was solved? Or if they have expressed frustration with the handling of the issue?

In this situation, aspect-based sentiment analysis could be used. This type of text analysis delves into the feelings and topics behind the words on different support channels, such as support tickets, chat conversations, emails, and CSAT surveys. A text analysis model can understand words or expressions to define the support interaction as Positive, Negative, or Neutral, understand what was mentioned (e.g. Service or UI/UX), and even determine the sentiments behind the words (e.g. Sadness, Anger, etc.).

Urgency Detection: Prioritize Urgent Tickets

“Where do I start?” is a question most customer service representatives often ask themselves. Urgency is definitely a good starting point, but how do we define the level of urgency without wasting valuable time deliberating?

Text mining software can define the urgency level of a customer ticket and tag it accordingly. Support tickets with words and expressions that denote urgency, such as 'as soon as possible' or 'right away', are duly tagged as Priority.
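
As a rough illustration of that rule, the sketch below tags a ticket as Priority when it contains one of a few urgency phrases. The first two phrases come from the text above. The extra phrases and the "Normal" fallback tag are assumptions added for the example, and a trained urgency model would look at far more than a fixed phrase list.

```python
# Phrases that signal urgency. The last two are illustrative additions.
URGENT_PHRASES = ("as soon as possible", "right away", "asap", "urgent")


def tag_urgency(ticket_text: str) -> str:
    """Return 'Priority' if the ticket contains an urgency phrase, else 'Normal'."""
    lowered = ticket_text.lower()
    if any(phrase in lowered for phrase in URGENT_PHRASES):
        return "Priority"
    return "Normal"


tickets = [
    "Our checkout page is down, please fix it as soon as possible",
    "Thanks for the update, everything looks fine now",
]
for ticket in tickets:
    print(tag_urgency(ticket), "-", ticket)
```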

To see how text analysis works to detect urgency, check out this MonkeyLearn urgency detection demo model.

Voice of Customer (VoC) & Customer Feedback

Once you get a customer, retention is key, since acquiring new clients is five to 25 times more expensive than retaining the ones you already have. That's why paying close attention to the voice of the customer can give your company a clear picture of the level of client satisfaction and, consequently, of client retention. Also, it can give you actionable insights to prioritize the product roadmap from a customer's perspective.

Analyzing NPS Responses

Maybe your brand already has a customer satisfaction survey in place, the most common one being the Net Promoter Score (NPS). This survey asks the question, 'How likely is it that you would recommend [brand] to a friend or colleague?'. The answer is a score from 0-10 and the result is divided into three groups: the promoters, the passives, and the detractors.
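
For reference, here is a small sketch of how the three groups and the final score are usually computed. It uses the standard NPS cutoffs (9-10 promoters, 7-8 passives, 0-6 detractors), which the text above doesn't spell out, and the sample scores are invented.

```python
def nps_group(score: int) -> str:
    """Map a 0-10 answer to its NPS group using the standard cutoffs."""
    if not 0 <= score <= 10:
        raise ValueError("NPS scores range from 0 to 10")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(scores: list[int]) -> float:
    """NPS = percentage of promoters minus percentage of detractors."""
    if not scores:
        raise ValueError("at least one score is required")
    groups = [nps_group(s) for s in scores]
    promoters = groups.count("promoter")
    detractors = groups.count("detractor")
    return 100 * (promoters - detractors) / len(groups)


# Hypothetical survey answers: 4 promoters, 2 passives, 2 detractors.
scores = [10, 9, 8, 7, 6, 3, 10, 9]
print(net_promoter_score(scores))  # 25.0
```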

But here comes the tricky part: there's an open-ended follow-up question at the end: 'Why did you choose X score?' The answer can provide your company with invaluable insights. Without the text, you're left guessing what went wrong. And, now, with text analysis, you no longer have to read through these open-ended responses manually.

One option is to extract the main keywords of your customers' feedback to understand what's being praised or criticized about your product. Is the keyword 'Product' mentioned mostly by promoters or detractors? With this info, you'll be able to use your time to get the most out of NPS responses and start taking action.
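
A bare-bones version of that keyword comparison might look like the sketch below, which counts the most frequent non-trivial words separately for promoters and detractors. The responses and stopword list are invented for the example, and plain word frequency is a simplification of what a dedicated keyword extractor does.

```python
import re
from collections import Counter

# Minimal stopword list, just enough for this example.
STOPWORDS = {"the", "is", "a", "and", "to", "too", "it", "i", "of", "but", "for"}

# Hypothetical (score, comment) pairs from an NPS survey.
responses = [
    (10, "The product is easy to use and support is great"),
    (9, "Great product, love the dashboard"),
    (3, "Support is slow and pricing is too high"),
    (5, "Pricing is confusing and support never replies"),
]


def top_keywords(comments, n=3):
    """Return the n most frequent non-stopword terms across the comments."""
    words = []
    for comment in comments:
        words += [w for w in re.findall(r"[a-z]+", comment.lower()) if w not in STOPWORDS]
    return Counter(words).most_common(n)


promoter_comments = [text for score, text in responses if score >= 9]
detractor_comments = [text for score, text in responses if score <= 6]
print("Promoters:", top_keywords(promoter_comments))
print("Detractors:", top_keywords(detractor_comments))
```

In this toy data, "support" shows up on both sides, which is exactly why the surrounding sentiment matters and not only the keyword itself.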

Another option is following in Retently's footsteps using text analysis to classify your feedback into different topics, such as Customer Support, Product Design, and Product Features, then analyze each tag with sentiment analysis to see how positively or negatively clients feel about each topic. Now they know they're on the right track with product design, but still have to work on product features.

Analyzing Customer Surveys

Does your company have another customer survey system? If it's a scoring system or closed-ended questions, it'll be a piece of cake to analyze the responses: just crunch the numbers.

However, if you have an open-text survey, whether it's provided via email or it's an online form, you can stop manually tagging every single response by letting text analysis do the job for you. Besides saving time, you can also have consistent tagging criteria without errors, 24/7.

Business Intelligence

Data analysis is at the core of every business intelligence operation. Now, what can a company do to understand, for instance, sales trends and performance over time? With numeric data, a BI team can identify what's happening (such as sales of X are decreasing), but not why. Numbers are easy to analyze, but they are also somewhat limited. Text data, on the other hand, is the most widespread format of business information and can provide your organization with valuable insight into your operations. Text analysis with machine learning can automatically analyze this data for immediate insights.

For example, you can run keyword extraction and sentiment analysis on your social media mentions to understand what people are complaining about regarding your brand.

You can also run aspect-based sentiment analysis on customer reviews that mention poor customer experiences. After all, 67% of consumers list bad customer experience as one of the primary reasons for churning. Maybe it's bad support, a faulty feature, unexpected downtime, or a sudden price change. Analyzing customer feedback can shed light on the details, and the team can take action accordingly.

And what about your competitors? What are their reviews saying? Run them through your text analysis model and see what they're doing right and wrong and improve your own decision-making.

Sales and Marketing

Prospecting is the most difficult part of the sales process, and it's getting harder and harder. Sales teams always want to close more deals, which requires making the sales process more efficient. But 27% of sales agents spend over an hour a day on data entry instead of selling, meaning critical time is lost to administrative work rather than closing deals.

Text analysis takes the heavy lifting out of manual sales tasks, including:

  • Updating the deal status as 'Not interested' in your CRM.
  • Qualifying your leads based on company descriptions.
  • Identifying leads on social media that express buying intent.

GlassDollar, a company that links founders to potential investors, is using text analysis to find the best quality matches. How? They use text analysis to classify companies using their company descriptions. The results? They saved themselves days of manual work, and predictions were 90% accurate after training a text classification model. You can learn more about their experience with MonkeyLearn here.

Not only can text analysis automate manual and tedious tasks, but it can also improve your analytics to make the sales and marketing funnels more efficient. For example, you can automatically analyze the responses from your sales emails and conversations to understand, let's say, a drop in sales:

  • What are the blocks to completing a deal?
  • What sparks a customer's interest?
  • What are customer concerns?

Now, imagine that your sales team's goal is to target a new segment for your SaaS: people over 40. The first impression is that they don't like the product, but why? Just filter that age group's sales conversations and run them through your text analysis model. Sales teams could make better decisions using in-depth text analysis on customer conversations.

Finally, you can use machine learning and text analysis to provide a better experience overall within your sales process. For example, Drift, a marketing conversational platform, integrated the MonkeyLearn API to allow recipients to automatically opt out of sales emails based on how they reply.

It's time to boost sales and stop wasting valuable time with leads that don't go anywhere. Xeneta, a sea freight company, developed a machine learning algorithm and trained it to identify which companies were potential customers, based on the company descriptions gathered through FullContact (a SaaS company that has descriptions of millions of companies).

You can do the same or target users that visit your website to:

  • Get information about where potential customers work using a service like Clearbit and classify the company according to its type of business to see if it's a possible lead.
  • Extract information to easily learn the user's job position, the company they work for, its type of business and other relevant information.
  • Home in on the most qualified leads and save time looking for them: sales reps will receive the information automatically and start targeting the potential customers right away.

Product Analytics

Let's imagine your startup has an app on the Google Play store. You're receiving some unusually negative comments. What's going on?

You can find out what’s happening in just minutes by using a text analysis model that groups reviews into different tags like Ease of Use and Integrations. Then run them through a sentiment analysis model to find out whether customers are talking about products positively or negatively. Finally, graphs and reports can be created to visualize and prioritize product problems with MonkeyLearn Studio.

We did this with reviews for Slack from the product review site Capterra and got some pretty interesting insights. Here's how:

We analyzed reviews with aspect-based sentiment analysis and categorized them into main topics and sentiment.

We extracted keywords with the keyword extractor to get some insights into why reviews that are tagged under 'Performance-Quality-Reliability' tend to be negative.

Text Analysis Resources

There are a number of valuable resources out there to help you get started with all that text analysis has to offer.

Text Analysis APIs

You can use open-source libraries or SaaS APIs to build a text analysis solution that fits your needs. Open-source libraries require a lot of time and technical know-how, while SaaS tools can often be put to work right away and require little to no coding experience.

Open Source Libraries

Python is one of the most widely used languages in scientific computing. Tools like NumPy and SciPy have established it as a dynamic language that stays fast by calling C and Fortran libraries where performance is needed.

This, combined with a thriving community and a diverse set of libraries for implementing natural language processing (NLP) models, has made Python one of the most preferred programming languages for text analysis.

NLTK, the Natural Language Toolkit, is a best-of-class library for text analysis tasks. NLTK is used in many university courses, so there's plenty of code written with it and no shortage of users familiar with both the library and the theory of NLP who can help answer your questions.

SpaCy is an industrial-strength statistical NLP library. Aside from the usual features, it adds deep learning integration and convolutional neural network models for multiple languages.

Unlike NLTK, which is a research library, SpaCy aims to be a battle-tested, production-grade library for text analysis.


Scikit-learn is a complete and mature machine learning toolkit for Python built on top of NumPy, SciPy, and matplotlib, which gives it stellar performance and flexibility for building text analysis models.

Developed by Google, TensorFlow is by far the most widely used library for distributed deep learning. Looking at this graph we can see that TensorFlow is ahead of the competition:

Tensorflow adoption

PyTorch is a deep learning platform built by Facebook. It is a Python-centric library, which allows you to define much of your neural network architecture in terms of Python code, and only internally deals with lower-level high-performance code.

Keras is a widely-used deep learning library written in Python. It's designed to enable rapid iteration and experimentation with deep neural networks, and as a Python library, it's uniquely user-friendly.

An important feature of Keras is that it provides what is essentially an abstract interface to deep neural networks. The actual networks can run on top of TensorFlow, Theano, or other backends. This backend independence makes Keras an attractive option in terms of its long-term viability.

Keras's permissive open-source license makes it attractive to businesses looking to develop proprietary models.

R is the pre-eminent language for any statistical task. Its collection of libraries (13,711 on CRAN at the time of writing) far surpasses any other programming language's capabilities for statistical computing and is larger than many other ecosystems. In short, if you choose to use R for anything statistics-related, you won't find yourself in a situation where you have to reinvent the wheel, let alone the whole stack.

Caret is an R package designed to build complete machine learning pipelines, with tools for everything from data ingestion and preprocessing to feature selection and automatic model tuning.

The Machine Learning in R project (mlr for short) provides a complete machine learning toolkit for the R programming language that's frequently used for text analysis.

Java needs no introduction. The language boasts an impressive ecosystem that stretches beyond Java itself and includes the libraries of other JVM languages such as Scala and Clojure. Beyond that, the JVM is battle-tested and has had thousands of person-years of development and performance tuning, so Java is likely to give you best-of-class performance for all your text analysis NLP work.

Stanford's CoreNLP project provides a battle-tested, actively maintained NLP toolkit. While it's written in Java, it has APIs for all major languages, including Python, R, and Go.

The Apache OpenNLP project is another machine learning toolkit for NLP. It can be used from any language on the JVM platform.

Weka is a GPL-licensed Java library for machine learning, developed at the University of Waikato in New Zealand. In addition to a comprehensive collection of machine learning APIs, Weka has a graphical user interface called the Explorer, which allows users to interactively develop and study their models.

Weka supports extracting data from SQL databases directly, as well as deep learning through the deeplearning4j framework.

Using a SaaS API for text analysis has a lot of advantages:

  • Most SaaS tools are simple plug-and-play solutions with no libraries to install and no new infrastructure.
  • SaaS APIs provide ready-to-use solutions. You give them data and they return the analysis. Every other concern (performance, scalability, logging, architecture, tools, etc.) is offloaded to the party responsible for maintaining the API.
  • You often just need to write a few lines of code to call the API and get the results back.
  • Easy integration: SaaS APIs usually provide ready-made integrations with tools you may already use. This will allow you to build a truly no-code solution. Learn how to integrate text analysis with Google Sheets.
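
To give a feel for what "a few lines of code" can mean, here is a hedged sketch that builds a JSON request for a text analysis API using only Python's standard library. The endpoint URL, the `Token` authorization scheme, and the `{"data": [...]}` payload shape are placeholders invented for this example, not any specific provider's API, so check your provider's documentation for the real values.

```python
import json
import urllib.request

# Hypothetical endpoint and key: replace with values from your provider's docs.
API_URL = "https://api.example.com/v1/classify"
API_KEY = "YOUR_API_KEY"


def build_request(texts):
    """Build (but do not send) a JSON POST request carrying the texts to analyze."""
    payload = json.dumps({"data": texts}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Authorization": f"Token {API_KEY}", "Content-Type": "application/json"},
        method="POST",
    )


request = build_request(["The app keeps crashing", "Love the new update"])

# Sending is left commented out because the endpoint above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```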

Some of the most well-known SaaS solutions and APIs for text analysis include:

  • MonkeyLearn
  • Google Cloud NLP
  • MeaningCloud
  • Amazon Comprehend

There is an ongoing Build vs. Buy Debate when it comes to text analysis applications: build your own tool with open-source software, or use a SaaS text analysis tool?

Building your own software from scratch can be effective and rewarding if you have years of data science and engineering experience, but it’s time-consuming and can cost in the hundreds of thousands of dollars.

SaaS tools, on the other hand, are a great way to dive right in. They can be straightforward, easy to use, and just as powerful as building your own model from scratch. MonkeyLearn is a SaaS text analysis platform with dozens of pre-trained models. Or you can customize your own, often in only a few steps for results that are just as accurate. All with no coding experience necessary.

Training Datasets

If you talk to any data science professional, they'll tell you that the true bottleneck to building better models is not new and better algorithms, but more data.

Indeed, in machine learning data is king: a simple model, given tons of data, is likely to outperform one that uses every trick in the book to turn every bit of training data into a meaningful response.

So, here are some high-quality datasets you can use to get started:

Topic Classification

Reuters news dataset: one of the most popular datasets for text classification; it has thousands of articles from Reuters tagged with 135 categories according to their topics, such as Politics, Economics, Sports, and Business.

20 Newsgroups: a very well-known dataset that has roughly 20k documents across 20 different topics.

Sentiment Analysis

Product reviews: a dataset with millions of customer reviews from products on Amazon.

Twitter airline sentiment on Kaggle: another widely used dataset for getting started with sentiment analysis. It contains more than 15k tweets about airlines (tagged as positive, neutral, or negative).

First GOP Debate Twitter Sentiment: another useful dataset with more than 14,000 labeled tweets (positive, neutral, and negative) about the first GOP debate of the 2016 presidential race, held in August 2015.

Other Popular Datasets

Spambase: this dataset contains 4,601 emails tagged as spam and not spam.

SMS Spam Collection: another dataset for spam detection. It has more than 5k SMS messages tagged as spam and not spam.

Hate speech and offensive language: a dataset with more than 24k tagged tweets grouped into three tags: clean, hate speech, and offensive language.

Finding high-volume, high-quality training datasets is the most important part of text analysis, more important than the choice of programming language or tools for creating the models. Remember, the best-architected machine-learning pipeline is worthless if its models are backed by unsound data.

Text Analysis Tutorials

The best way to learn is by doing.

First, we'll go through programming-language-specific tutorials using open-source tools for text analysis. These will help you deepen your understanding of the available tools for your platform of choice.

Then, we'll walk through a step-by-step MonkeyLearn tutorial so you can get started with text analysis right away.

Tutorials Using Open Source Libraries

In this section, we'll look at various tutorials for text analysis in the main programming languages for machine learning that we listed above.

The official NLTK book is a complete resource that teaches you NLTK from beginning to end. In addition, the reference documentation is a useful resource to consult during development.

Other useful tutorials include:

WordNet with NLTK: Finding Synonyms for words in Python: this tutorial shows you how to build a thesaurus using Python and WordNet.

Tokenizing Words and Sentences with NLTK: this tutorial shows you how to use NLTK's language models to tokenize words and sentences.

spaCy 101: Everything you need to know: part of the official documentation, this tutorial shows you everything you need to know to get started using SpaCy.

This tutorial shows you how to build a WordNet pipeline with SpaCy.

Furthermore, there's the official API documentation, which explains the architecture and API of SpaCy.

If you prefer long-form text, there are a number of books about or featuring SpaCy:

  • Introduction to Machine Learning with Python: A Guide for Data Scientists.
  • Practical Machine Learning with Python.
  • Text Analytics with Python.

The official scikit-learn documentation contains a number of tutorials on the basic usage of scikit-learn, building pipelines, and evaluating estimators.

Scikit-learn Tutorial: Machine Learning in Python shows you how to use scikit-learn and Pandas to explore a dataset, visualize it, and train a model.

For readers who prefer books, there are a couple of choices:

Our very own Raúl Garreta wrote this book: Learning scikit-learn: Machine Learning in Python.

Additionally, the book Hands-On Machine Learning with Scikit-Learn and TensorFlow introduces the use of scikit-learn in a deep learning context.

The official Keras website has extensive API as well as tutorial documentation. For readers who prefer long-form text, the Deep Learning with Keras book is the go-to resource. The book uses real-world examples to give you a strong grasp of Keras.

Other tutorials:

Practical Text Classification With Python and Keras: this tutorial implements a sentiment analysis model using Keras, and teaches you how to train, evaluate, and improve that model.

Text Classification in Keras: this article builds a simple text classifier on the Reuters news dataset. It classifies the text of an article into a number of categories such as sports, entertainment, and technology.

TensorFlow Tutorial For Beginners introduces the mathematics behind TensorFlow and includes code examples that run in the browser, ideal for exploration and learning. The goal of the tutorial is to classify street signs.

The book Hands-On Machine Learning with Scikit-Learn and TensorFlow helps you build an intuitive understanding of machine learning using TensorFlow and scikit-learn.

Finally, there's the official Get Started with TensorFlow guide.

The official Get Started Guide from PyTorch shows you the basics of PyTorch. If you're interested in something more practical, check out this chatbot tutorial; it shows you how to build a chatbot using PyTorch.

The Deep Learning for NLP with PyTorch tutorial is a gentle introduction to the ideas behind deep learning and how they are applied in PyTorch.

Finally, the official API reference explains the functioning of each individual component.

A Short Introduction to the Caret Package shows you how to train and visualize a simple model. A Practical Guide to Machine Learning in R shows you how to prepare data, build and train a model, and evaluate its results. Finally, you have the official documentation which is super useful to get started with Caret.

For those who prefer long-form text, on arXiv we can find an extensive mlr tutorial paper. This is closer to a book than a paper and has extensive and thorough code samples for using mlr.

If you're interested in learning about CoreNLP, you should check out this tutorial, which explains how to quickly get started and perform a number of simple NLP tasks from the command line. Moreover, this CloudAcademy tutorial shows you how to use CoreNLP and visualize its results. You can also check out this tutorial specifically about sentiment analysis with CoreNLP. Finally, there's this tutorial on using CoreNLP with Python that is useful to get started with this framework.

First things first: the official Apache OpenNLP Manual should be the starting point. The book Taming Text was written by an OpenNLP developer and uses the framework to show the reader how to implement text analysis. Moreover, this tutorial takes you on a complete tour of OpenNLP, including tokenization, part of speech tagging, parsing sentences, and chunking.

The Weka library has an official book, Data Mining: Practical Machine Learning Tools and Techniques, that comes in handy for getting your feet wet with Weka.

If you prefer videos to text, there are also a number of MOOCs using Weka:

Data Mining with Weka: this is an introductory course to Weka.

More Data Mining with Weka: this course involves larger datasets and a more complete text analysis workflow.

Advanced Data Mining with Weka: this course focuses on packages that extend Weka's functionality.

The Text Mining in WEKA Cookbook provides text-mining-specific instructions for using Weka.

How to Run Your First Classifier in Weka: shows you how to install Weka, run it, run a classifier on a sample dataset, and visualize its results.

Text Analysis Tutorial With MonkeyLearn Templates

MonkeyLearn Templates is a simple and easy-to-use platform that you can use without adding a single line of code.

Follow the step-by-step tutorial below to see how you can run your data through text analysis tools and visualize the results: 

1. Choose a template to create your workflow:

Choose template.

2. Upload your data.

We chose the app review template, so we’re using a dataset of reviews.

Upload your data.

If you don't have a CSV file:

  • You can use our sample dataset.
  • Or, download your own survey responses from the survey tool you use with this documentation.

3. Match your data to the right fields in each column:

Match columns to fields.

  • created_at: Date that the response was sent.
  • text: Text of the response.
  • rating: Score given by the customer.
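
If your survey tool exports different column names, one optional shortcut is to rename them before uploading so the matching step becomes trivial. The sketch below does this with Python's built-in `csv` module. The export headers (`Submitted`, `Comment`, `Stars`) and sample rows are made up, and in-memory strings stand in for real files.

```python
import csv
import io

# Hypothetical export from a survey tool, with its own column names.
raw_export = io.StringIO(
    "Submitted,Comment,Stars\n"
    "2024-01-05,Love the new dashboard,5\n"
    "2024-01-06,App crashes on login,1\n"
)

# Assumed mapping from the export's headers to the fields listed above.
COLUMN_MAP = {"Submitted": "created_at", "Comment": "text", "Stars": "rating"}

output = io.StringIO()
writer = csv.DictWriter(
    output, fieldnames=["created_at", "text", "rating"], lineterminator="\n"
)
writer.writeheader()
for row in csv.DictReader(raw_export):
    # Rename each column according to COLUMN_MAP and write the row out.
    writer.writerow({COLUMN_MAP[name]: value for name, value in row.items()})

print(output.getvalue())
```

To work with real files, swap the `io.StringIO` objects for `open("export.csv", newline="")` and `open("upload.csv", "w", newline="")`.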

4. Name your workflow:

Name your workflow.

5. Wait for MonkeyLearn to process your data:

Wait for data to process.

6. Explore your dashboard!

Explore dashboard.

MonkeyLearn’s data visualization tools make it easy to understand your results in striking dashboards. Spot patterns, trends, and immediately actionable insights in broad strokes or minute detail.

  • Filter by topic, sentiment, keyword, or rating.
  • Share via email with other coworkers.

Text analysis is no longer an exclusive, technobabble topic for software engineers with machine learning experience. It has become a powerful tool that helps businesses across every industry gain useful, actionable insights from their text data. Saving time, automating tasks and increasing productivity has never been easier, allowing businesses to offload cumbersome tasks and help their teams provide a better service for their customers.

If you would like to give text analysis a go, sign up to MonkeyLearn for free and begin training your very own text classifiers and extractors – no coding needed thanks to our user-friendly interface and integrations.

And take a look at the MonkeyLearn Studio public dashboard to see how data visualization can show your results in broad strokes or super minute detail.

Reach out to our team if you have any doubts or questions about text analysis and machine learning, and we'll help you get started!


MonkeyLearn Inc. All rights reserved 2024

Free high-quality Text Analysis Tools

powered by Prose Analyzer

Text Structure Analysis

A versatile solution for SEO optimization, readability improvement, and comprehensive linguistic insights. Our all-in-one tool!

Keyword Analysis

Identifies and evaluates the significance and usage of specific keywords and phrases in your text, enhancing the focus and relevance of your writing.

Sentiment Analysis

Analyzes text to determine the underlying emotional tone, providing insights into the overall sentiment conveyed by the writing.

Paragraph Analysis

Evaluates paragraph structure and composition, ensuring each section contributes effectively to the overall message and readability of the text.

Sentence Analysis

Focuses on analyzing individual sentences for grammatical accuracy and clarity, aiding in the creation of concise and impactful writing.

Word Analysis

Examines word choice and frequency, offering insights into the effectiveness and diversity of language used in the text.

Elevate Your Writing with Prose Analyzer

Welcome to Prose Analyzer – your advanced online text analysis tool.

Whether you are a student polishing an essay, an educator guiding learners, a researcher handling vast amounts of data, or a content creator shaping impactful prose, Prose Analyzer is your comprehensive solution. Tailor-made for users with a diversity of needs, our tool brings unprecedented insights into the fabric of your written content.

When to use Prose Analyzer?

Unlock the full potential of Prose Analyzer across a spectrum of applications:

Word and Character Count:

Swiftly assess the length of your content with our powerful word counter and character counter, capable of handling text volumes from 100k to 1 million words.

Keyword Analysis:

Fine-tune your content with detailed keyword analytics, ensuring your message resonates effectively with your audience.

Paragraph and Sentence Averages:

Gain insights into your writing structure with averages for characters, words, and sentences per paragraph, as well as characters and words per sentence. Explore these features now to elevate your text analysis experience and boost your understanding of written content.
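
For a sense of how such averages can be derived, here is a simplified sketch. It assumes paragraphs are separated by blank lines and sentences end with ".", "!" or "?", which is a rough approximation and not necessarily how Prose Analyzer itself splits text.

```python
import re


def text_averages(text: str) -> dict:
    """Compute simple per-sentence and per-paragraph averages for non-empty text."""
    # Paragraphs: blocks separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Sentences: split after ., ! or ? when followed by whitespace.
    sentences = [
        s for p in paragraphs for s in re.split(r"(?<=[.!?])\s+", p.strip()) if s
    ]
    words = re.findall(r"\b[\w']+\b", text)
    return {
        "words_per_sentence": len(words) / len(sentences),
        "sentences_per_paragraph": len(sentences) / len(paragraphs),
        "words_per_paragraph": len(words) / len(paragraphs),
    }


sample = "Short sentences read fast. Long ones slow the reader down.\n\nMix both for rhythm."
stats = text_averages(sample)
print(stats)
```

Abbreviations such as "e.g." or "Dr." would be counted as sentence endings here, so treat the numbers as estimates.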

Reading and Speaking Time:

Estimate engagement and presentation durations at 230 words per minute and 150 words per minute, respectively.

Sentiment Analysis:

Dive into the emotional tone of your text with our intuitive pie chart, providing nuanced insights into global sentiment.
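The counting, averaging, and timing features listed above can be approximated in a few lines. The sketch below is only an illustration, not Prose Analyzer's actual implementation: the function name `text_stats`, the regular expressions, and the naive sentence splitting are all assumptions. Only the 230 and 150 words-per-minute rates come from the text.

```python
import re

READING_WPM = 230   # reading speed quoted in the feature list
SPEAKING_WPM = 150  # speaking speed quoted in the feature list


def text_stats(text: str) -> dict:
    """Return simple counts, averages, and time estimates for `text`.

    Hypothetical helper: the splitting rules are deliberately naive.
    """
    # Paragraphs: blocks separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Sentences: split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words: runs of letters, digits, or apostrophes.
    n_words = len(re.findall(r"[A-Za-z0-9']+", text))
    return {
        "characters": len(text),
        "words": n_words,
        "sentences": len(sentences),
        "paragraphs": len(paragraphs),
        "words_per_sentence": n_words / len(sentences) if sentences else 0.0,
        "sentences_per_paragraph": (
            len(sentences) / len(paragraphs) if paragraphs else 0.0
        ),
        "reading_minutes": n_words / READING_WPM,
        "speaking_minutes": n_words / SPEAKING_WPM,
    }


if __name__ == "__main__":
    sample = "One two three. Four five!\n\nSix seven eight nine?"
    for name, value in text_stats(sample).items():
        print(f"{name}: {value}")
```

A real tool would need sturdier sentence detection (abbreviations, decimals, quotations), but the averages and time estimates follow directly from these counts.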

Who needs Prose Analyzer?

Prose Analyzer caters to a diverse audience, making it an indispensable tool for various purposes:

Students:

Elevate your academic writing with Prose Analyzer. Ensure your essays and assignments meet word count requirements while refining the structure for maximum impact. Prose Analyzer empowers you to submit polished, well-structured work, impressing both instructors and peers.


Researchers:

Efficiently navigate vast amounts of textual data with Prose Analyzer. Save time in the research process by gaining quick insights into word and character counts. Leverage features like paragraph and sentence averages, enhancing your ability to analyze and comprehend large volumes of text, ultimately boosting your research output.


Job Seekers:

Craft impactful resumes and cover letters with Prose Analyzer. Tailor your application materials to meet optimal word count standards while ensuring clarity and precision. Prose Analyzer empowers job-seekers to present themselves compellingly in writing, making a lasting impression on potential employers.

Digital Marketers:

Maximize the impact of your content strategy with Prose Analyzer. Analyze and refine your written material to resonate effectively with your target audience. Prose Analyzer's insights into word and keyword usage provide digital marketers with the tools to enhance the effectiveness of their campaigns and messaging.

Educators:

Guide your students toward improved writing practices with Prose Analyzer. Assess their work with detailed text analysis, providing constructive feedback on word usage, structure, and overall composition. Prose Analyzer becomes an invaluable tool in fostering better writing habits and facilitating impactful learning experiences.

Writers:

Refine your craft with Prose Analyzer's detailed insights. Gain a nuanced understanding of your writing structure, word usage, and overall composition. Whether you are a seasoned author or aspiring wordsmith, Prose Analyzer equips you with the tools to enhance the effectiveness and impact of your prose.

Social Media Users:

Optimize your social media posts for maximum engagement with Prose Analyzer. Understand the analytics behind your text, ensuring clarity and resonance with your audience. Whether you are a casual user or a social media influencer, Prose Analyzer helps you fine-tune your content for optimal impact.

Bloggers and Online Business Owners:

Improve the visibility and impact of your online content with Prose Analyzer. Understand the structure and keyword usage in your articles, blog posts, or product descriptions. Prose Analyzer is an essential tool for bloggers and online business owners looking to optimize their content for search engines and audience engagement.

How to use Prose Analyzer?

Prose Analyzer is user-friendly and efficient:

1. Paste or Type: Begin typing or paste your text effortlessly.

2. Live Analysis: Prose Analyzer provides real-time analysis as you type, offering instant insights into your content.

3. Review Results: Explore paragraph, sentence, word, and character counts, and delve into advanced analytics for a thorough understanding of your text.

How does Prose Analyzer differ from others?

Prose Analyzer sets itself apart by offering more than just basic text analysis. Dive into the emotional tone of your content with our unique sentiment analysis feature. Visualize sentiments through an intuitive pie chart, providing nuanced insights into the global emotional context expressed in your text.

Versatile Analytics:

Prose Analyzer's power lies in its versatility, making it an indispensable tool for a diverse audience. From writers and students to professionals and content creators, our tool caters to various needs. Tailor your analysis to meet specific requirements, whether it is refining prose, meeting academic standards, or optimizing content for different platforms.

Detailed Averages:

Gain a deeper understanding of your writing structure with Prose Analyzer's detailed averages. Explore per-paragraph, per-sentence, and per-word averages, allowing you to scrutinize and refine your content with precision. This level of granularity empowers you to craft content that is not only well-structured but also impactful.

Intuitive Interface:

Prose Analyzer prioritizes user experience with its intuitive interface. Whether you are a novice or an experienced user, our tool ensures accessibility for all. Seamlessly navigate through its features, making text analysis a straightforward and enjoyable process.

This comprehensive feature set, coupled with an accessible and user-friendly design, positions Prose Analyzer as a standout tool for individuals seeking detailed insights into their written content.

Start Analyzing Your Prose Now!

Embark on a journey of text analysis with Prose Analyzer

Gain profound insights into your written content. Perfect for writers, students, professionals, and anyone seeking in-depth understanding of the nuances within their prose.

  • Open access
  • Published: 21 February 2024

Postexamination item analysis of undergraduate pediatric multiple-choice questions exam: implications for developing a validated question Bank

  • Nagwan I. Rashwan 1 ,
  • Soha R. Aref 2 ,
  • Omnia A. Nayel 3 &
  • Mennatallah H. Rizk 4

BMC Medical Education volume 24, Article number: 168 (2024)


Item analysis (IA) is widely used to assess the quality of multiple-choice questions (MCQs). The objective of this study was to perform a comprehensive quantitative and qualitative item analysis of two types of MCQs: single best answer (SBA) and extended matching questions (EMQs) currently in use in the Final Pediatrics undergraduate exam.


A descriptive cross-sectional study was conducted. We analyzed 42 SBA and 4 EMQ administered to 247 fifth-year medical students. The exam was held at the Pediatrics Department, Qena Faculty of Medicine, Egypt, in the 2020–2021 academic year. Quantitative item analysis included item difficulty (P), discrimination (D), distractor efficiency (DE), and test reliability. Qualitative item analysis included evaluation of the levels of cognitive skills and conformity of test items with item writing guidelines.

The mean score was 55.04 ± 9.8 out of 81. Approximately 76.2% of SBA items assessed low cognitive skills, and 75% of EMQ items assessed higher-order cognitive skills. The proportions of items with an acceptable range of difficulty (0.3–0.7) on the SBA and EMQ were 23.80 and 16.67%, respectively. The proportions of SBA and EMQ with acceptable ranges of discrimination (> 0.2) were 83.3 and 75%, respectively. The reliability coefficient (KR20) of the test was 0.84.

Our study will help medical teachers identify the quality of SBA and EMQ, which should be included to develop a validated question bank, as well as questions that need revision and remediation for subsequent use.

“Assessment affects students learning in at least four ways: its content, format, timing, and any subsequent feedback given to the medical students” [ 1 ]. MCQs are a well-established format for undergraduate medical student assessment, given that MCQs allow broad coverage of learning objectives. In addition, MCQs are objective and scored easily and quickly with minimal human-related errors or bias. Well-designed MCQs allow for the assessment of higher cognitive skills rather than low cognitive skills [ 2 ].

However, MCQs have some limitations. Constructing MCQs is difficult and time-consuming, even for well-trained staff members. There is evidence that basic item-writing principles are often not followed when MCQs are constructed. The presence of flawed MCQs can interfere with the accurate and meaningful interpretation of test scores and negatively affect student pass rates. Therefore, to develop reliable and valid tests, items must be constructed that are free of such flaws [ 3 ].

Item analysis (IA) is the set of qualitative and quantitative procedures used to evaluate the characteristics of items of the test before and after test development and construction. Quantitative item analysis uses statistical methods to help make judgments about which items need to be kept, reviewed, or discarded. Qualitative item analysis depends on the judgment of the reviewers about whether guidelines for item writing are followed or not [ 4 ].

In quantitative IA, three psychometric domains are assessed for each item: item difficulty (P), item discrimination (D), and distractor efficiency (DE) [ 5 ]. Item difficulty (P) refers to the proportion of students who correctly answered the item. It ranges from 0 to 1, so higher values indicate easier items [ 6 ]. Item discrimination (D) indicates the extent to which the item can differentiate between higher- and lower-achieving students. It ranges from −1.0 (perfect negative discrimination) to +1.0 (perfect positive discrimination) [ 6 ]. An item discrimination of more than 0.2 has been reported as evidence of item validity. Any item with discrimination below 0.2, or with negative discrimination, should be reviewed or discarded [ 7 , 8 ]. Distractor efficiency (DE) is determined for each item based on the number of nonfunctioning distractors (NFDs), that is, options selected by < 5% of students [ 9 ].
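As a minimal sketch of how these three indices can be computed from raw responses: P and the NFD rule follow the definitions above, but the text does not say which discrimination formula was used. The upper/lower-group index with a 27% cut shown here is therefore an assumption, and the function names are hypothetical.

```python
from collections import Counter


def item_difficulty(item_scores):
    """P: proportion of students who answered the item correctly (0 to 1)."""
    return sum(item_scores) / len(item_scores)


def item_discrimination(item_scores, total_scores, fraction=0.27):
    """D: proportion correct in the top group minus the bottom group.

    Assumption: upper/lower-group index with a 27% cut. The article does
    not state which discrimination formula its software applied.
    """
    n = len(total_scores)
    group = max(1, round(n * fraction))
    # Student indices ordered from lowest to highest total exam score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:group], order[-group:]
    p_upper = sum(item_scores[i] for i in upper) / group
    p_lower = sum(item_scores[i] for i in lower) / group
    return p_upper - p_lower


def nonfunctioning_distractors(choices, key, options, threshold=0.05):
    """Wrong options chosen by fewer than `threshold` of students."""
    counts = Counter(choices)
    n = len(choices)
    return [o for o in options if o != key and counts[o] / n < threshold]


if __name__ == "__main__":
    # Hypothetical data: 1 = correct, 0 = incorrect, for ten students.
    item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
    totals = [50, 48, 45, 44, 40, 30, 28, 25, 20, 10]
    print(item_difficulty(item))
    print(item_discrimination(item, totals))
    picks = ["A"] * 60 + ["B"] * 30 + ["C"] * 8 + ["D"] * 2
    print(nonfunctioning_distractors(picks, key="A", options="ABCDE"))
```

Other discrimination indices, such as the point-biserial correlation, are also in common use and can give different values for the same item.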

Qualitative IA should be routinely performed before and after the exam to review test items’ conformity with MCQ construction guidelines. The two most common threats to the quality of multiple-choice questions are item writing flaws (IWFs) and testing of lower cognitive function [ 10 ]. Item writing flaws are violations of MCQ construction guidelines meant to prevent testwiseness and irrelevant difficulty from influencing medical students’ performance on multiple-choice exams. IWFs can either introduce unnecessary difficulty unrelated to the intended learning outcomes or provide cues that enable testwise students to guess the correct answer without necessarily understanding the content. Both types of flaws can skew the final test scores and compromise the validity of the assessment [ 8 , 11 ]. Well-constructed MCQs allow the evaluation of higher-order cognitive skills such as the application of knowledge, interpretation, or synthesis rather than testing lower cognitive skills. In practice, however, MCQs are mostly used to test lower rather than higher cognitive skills, which can be considered a significant threat to their quality [ 12 ]. In many medical schools, faculty members are not sufficiently trained to construct MCQs that examine high cognitive skills linked to authentic professional situations [ 13 ].

This study aimed to perform a postexamination quantitative and qualitative item analysis of two types of MCQs, SBA and EMQ, to provide guidance when making decisions regarding keeping, reviewing, or discarding questions from exams or question banks.


Data were collected from the summative exam of the Pediatrics course (PED502, a 7-credit-hour course), which was conducted at the Qena Faculty of Medicine, South Valley University, Qena, Egypt. The medical school implements a ‘6 + 1’ medical curriculum: a comprehensive seven-year educational program that includes 6 years of foundational and clinical medical education, followed by a year of practical training or internship. The Qena Faculty of Medicine, South Valley University, was officially accredited by the National Authority for Quality Assurance and Accreditation of Education (NAQAAE) in 2021. A total of 247 fifth-year medical students were qualified to take the pediatric final exam during the second semester of the 2020–2021 academic year. All exam questions were authored by faculty members of the Pediatrics Department, Qena Faculty of Medicine, South Valley University, and each was intended to have one correct response.

The exam papers and the relevant SBA and EMQ item analysis reports were collected and reviewed. Remark Classic OMR® (MCQ test item analysis software), which automates the collection and analysis of data from “fill in the bubble” forms, was used to scan and analyze the exam data. The information collected comprised the test item analysis report, the number of questions graded, students’ responses (correct, incorrect, no response), item difficulty (P), item discrimination (D), and distractor efficiency (DE).

Two types of multiple-choice questions (MCQs) were used in this exam: Single Best Answer (SBA) and Extended Matching Questions (EMQs). There were 42 SBA items with five options each, and four EMQ sets, each with three stems and eight options. A correct response was awarded one and a half marks, and an incorrect response was given zero marks.

The qualitative item analysis was performed by three assessors, who had content-area expertise and experience in preparing multiple-choice exams. They were provided with an MCQ qualitative analysis checklist to review the exam (Additional file 1 ). Each SBA and EMQ was analyzed independently by the three assessors for the level of cognitive skill tested and the presence of item writing flaws. Questions were categorized according to the modified Bloom’s taxonomy: Level I, Knowledge (recall of information); Level II, Comprehension and Application (ability to interpret data); and Level III, Problem solving (use of knowledge and understanding in new circumstances) [ 14 ]. Cohen’s κ was run to determine the inter-rater reliability for the three assessors, which was found to be moderate, with a kappa coefficient of 0.591 ( p < 0.001). This indicates a significant level of agreement between the assessors beyond what would be expected by chance, according to the guidelines proposed by Landis and Koch (1977) [ 15 ].
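For readers unfamiliar with the agreement statistic, here is a minimal sketch of Cohen's κ. The statistic is defined for one pair of raters, and the text does not say how the three assessors' ratings were combined, so the pairwise form shown here is an assumption, as are the function names and the example ratings. The label bands are those commonly cited from Landis and Koch (0.41 to 0.60 moderate, 0.61 to 0.80 substantial).

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Verbal label for kappa using the Landis and Koch (1977) bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical Bloom levels (1 to 3) assigned by two assessors.
    a = [1, 1, 2, 2, 3, 3, 1, 2]
    b = [1, 1, 2, 3, 3, 3, 1, 1]
    k = cohens_kappa(a, b)
    print(round(k, 3), landis_koch_label(k))
```

With three raters, one option is to average the pairwise values. Fleiss' κ is a multi-rater alternative.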

SBA item writing flaws (IWFs) were retrieved from the NBME item writing guide (6th edition, 2020) [ 11 ]. IWFs were categorized and scored as Stem Flaws (1 = negatively phrased stem, 2 = logical/grammatical cue, 3 = vague, unclear term, 4 = tricky, unnecessarily complicated stem, 5 = no lead-in question/defective, 6 = poorly constructed, short) and Option Flaws (1 = long, complex options, 2 = inconsistent use of numeric data, 3 = “None of the above” option, 4 = nonhomogeneous options, 5 = collectively exhaustive options, 6 = absolute terms, 7 = grammatical/logical clues, 8 = correct answer stands out, 9 = word repeats (clang clue), 10 = convergence). EMQ IWFs were retrieved from the work of Case and Swanson (1993), which highlighted the characteristics of well-written EMQs [ 16 ]. EMQ IWFs were categorized and scored as Option Flaws (1 = fewer than 6 or more than 25 options, 2 = not focused, 3 = no logical/alphabetical order, 4 = not homogeneous, 5 = overlapping/complex), Lead-in Question Flaws (1 = not clear/focused, 2 = nonspecific), and Stem Flaws (1 = non-vignette, 2 = vignette not clear/focused, 3 = short, poorly constructed).

Data analysis

Descriptive methods are based on Classical Test Theory (CTT). CTT considers reliability, difficulty, discrimination, and distractor efficiency to check the appropriateness and plausibility of all distractors. The core of this theory is that an observed test score is a function of the true score and random measurement error [ 17 ]. Item psychometric parameters were collected from the reported examination statistics, including item difficulty (P), item discrimination (D), distractor efficiency (DE), and internal consistency reliability for the whole test. The criteria for classifying item difficulty were as follows: P < 0.3 (too difficult), P between 0.3 and 0.7 (good/acceptable/average), P > 0.7 (too easy), and P between 0.5 and 0.6 (excellent/ideal). The criteria for classifying item discrimination were as follows: D ≤ 0.20 (poor), D from 0.21 to 0.39 (good), and D ≥ 0.4 (excellent). Items were categorized by the number of NFDs in each SBA and EMQ. For a five-option SBA with 4, 3, 2, 1, or 0 NFDs, the corresponding distractor efficiency (DE) is 0, 25, 50, 75, and 100%, respectively. For an eight-option EMQ with 7, 6, 5, 4, 3, 2, 1, or 0 NFDs, the corresponding DE is 0.00, 14.30, 28.50, 42.80, 57.10, 71.40, 85.70, and 100.00%, respectively.
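The classification rules above translate directly into code. The sketch below uses hypothetical function names and takes the DE formula to be the share of distractors that are functioning, which reproduces the percentages listed for five-option and eight-option items up to rounding.

```python
def classify_difficulty(p):
    """Label item difficulty P using the cut-offs stated in the text."""
    if p < 0.3:
        return "too difficult"
    if p > 0.7:
        return "too easy"
    if 0.5 <= p <= 0.6:
        return "excellent/ideal"
    return "acceptable"


def classify_discrimination(d):
    """Label item discrimination D using the cut-offs stated in the text."""
    if d <= 0.20:
        return "poor"
    if d < 0.40:
        return "good"
    return "excellent"


def distractor_efficiency(n_options, n_nfd):
    """DE (%) = functioning distractors / all distractors * 100.

    One option is the key, so an item has n_options - 1 distractors.
    """
    n_distractors = n_options - 1
    return 100.0 * (n_distractors - n_nfd) / n_distractors


if __name__ == "__main__":
    print(classify_difficulty(0.55), classify_discrimination(0.32))
    # Five-option SBA with 0..4 nonfunctioning distractors.
    print([distractor_efficiency(5, k) for k in range(5)])
    # Eight-option EMQ with 0..7 nonfunctioning distractors.
    print([round(distractor_efficiency(8, k), 2) for k in range(8)])
```

For eight-option items the exact fractions are multiples of 1/7, so computed values may differ in the last digit from the rounded figures quoted in the text.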

Test reliability

Reliability refers to how consistent the results from the test are. The Kuder-Richardson formula 20 (KR-20) is a measure of reliability for a test with binary variables (i.e., answers that are right or wrong). KR-20 estimates the extent to which performance on an item relates to the overall test scores. In this study, KR-20 was used to estimate the reliability of the pediatric final exam. Because a single test was used, reliability was estimated with an internal consistency method. KR-20 scores range from 0 to 1, where 0 is no reliability and 1 is perfect reliability. A KR-20 value between 0.7 and 0.9 falls in the good range. Reliability estimates can be applied in numerous ways in assessment. A practical application of the reliability coefficient is to compute the Standard Error of Measurement (SEM). The SEM is calculated for the full range of scores on an evaluation using the formula SEM = Standard deviation × √(1 − Reliability). The SEM can be used to create confidence intervals around the observed assessment score, which indicate the accuracy of the measurement, given the reliability of the evaluation, at each scoring level. This estimate helps assessors determine how an individual’s observed test score and true score differ [ 18 ].
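To make the two quantities concrete, the sketch below computes KR-20 from a small 0/1 response matrix and the SEM from a standard deviation and a reliability value. It uses the standard KR-20 formula, k/(k − 1) × (1 − Σpq/σ²), with the population variance of total scores; the variance convention and function names are assumptions, since the study took these statistics from its scanning software rather than computing them this way.

```python
import math
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of student rows, each a list of 0/1 item scores."""
    k = len(responses[0])   # number of items
    n = len(responses)      # number of students
    totals = [sum(row) for row in responses]
    # Assumption: population variance of total scores (some texts use n - 1).
    var_total = pvariance(totals)
    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), as given in the text."""
    return sd * math.sqrt(1 - reliability)


if __name__ == "__main__":
    # Hypothetical responses: four students, three items.
    data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(kr20(data))
    # SD and reliability as reported in the abstract of this article.
    print(standard_error_of_measurement(9.82, 0.84))
```

With the abstract's standard deviation of about 9.8 and KR-20 of 0.84, the formula gives an SEM of about 3.93, close to the 3.91 reported in the article.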

Basic frequency distributions and descriptive statistics were computed for all variables. Normality was tested using Q-Q plots, frequency histograms (with a normal curve overlaid), and the Shapiro-Wilk test of normality. The normality assumption was met for all variables except one (difficulty level of SBA). This variable was subjected to a two-step normalization process to achieve a normal distribution, as per the method outlined by Templeton (2011). This approach ensured a more accurate analysis of the data [ 19 ].

A parametric significance test, the independent t-test, was used to compare the mean difficulty and discrimination of the SBA and EMQ formats and to determine whether the differences between them were statistically significant. All analyses were two-tailed, with p = .05 as the threshold for statistical significance, and were conducted using SPSS Statistics for Windows, Version 24 (IBM Corp.). The figure was generated in Microsoft Excel 2013.
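As an illustration of the comparison described above, the sketch below computes the pooled-variance (equal-variance) independent t statistic from group summary statistics. It is not the SPSS procedure the authors ran: the function name and inputs are hypothetical, and whether equal variances were assumed is not stated in the text.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent two-sample t statistic assuming equal variances.

    Returns (t, degrees_of_freedom). A two-tailed p-value would come from
    the t distribution with that many degrees of freedom.
    """
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their df.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


if __name__ == "__main__":
    # Hypothetical summary statistics for two groups of items.
    t, df = pooled_t(mean1=10.0, sd1=2.0, n1=5, mean2=8.0, sd2=2.0, n2=5)
    print(round(t, 3), df)
```

A statistic computed from rounded means and standard deviations will generally differ somewhat from one computed on the raw item data. If SciPy is available, `scipy.stats.ttest_ind_from_stats` takes the same kind of summary inputs and also returns the p-value.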

The final pediatrics exam was composed of 54 items, and the total score was 81 (1.5 marks for each question). The mean exam score was 55.04 ± 9.82. The value of KR-20 was 0.84. This is considered acceptable, as it is greater than the commonly accepted threshold of 0.7, and suggests that the MCQ exam in this study is a reliable tool for assessment [ 20 ]. Reliability depends both on the Standard Error of Measurement (SEM) and on the ability range (standard deviation, SD) of students taking an assessment. The SEM was 3.91. The smaller the SEM, the more accurate the assessments being made [ 21 ].

Quantitative item analysis

Most items were easy ( P > 0.7): 61.9% of SBA and 66.67% of EMQ items. Items of moderate difficulty (0.3 < P ≤ 0.7) made up 23.8% of SBA and 16.67% of EMQ items, and difficult items ( P ≤ 0.3) made up 14.3% of SBA and 16.67% of EMQ items. Item discrimination was > 0.2 for 83.3% of SBA and 75% of EMQ items, indicating good discriminating items. Three SBAs (7.10%) had poor discrimination (D ≤ 0.2). Four SBAs (9.5%) had negative discrimination. The mean DE of SBA was 37.69% ± 33.12. For SBAs, the percentage of functioning distractors was 36.9%, and the percentage of nonfunctioning distractors was 63.1%. Only 11.90% of SBAs had a distractor efficiency of 100.00%, while 26.20% had a distractor efficiency of 0.00%. The mean DE for EMQ was 13.09% ± 15.46. No EMQ had a DE of 100%, while 41.60% of EMQs had a DE of 0.00%. For EMQs, the percentage of functioning distractors was 13.1%, while the percentage of nonfunctioning distractors was 86.90%.

Table 1 indicates that only 9 SBAs (21.4%) met the recommended levels for difficulty and discrimination (with P ranging from 0.3 to 0.7 and D > 0.2). Table 2 indicates that only 2 EMQ items (16.7%) met the recommended levels for difficulty and discrimination (with P ranging from 0.3 to 0.7 and D > 0.2). These questions should be retained in the question bank, provided they are free of IWFs.

Table 3 shows a comparative analysis of the mean difficulty (P) and discrimination (D) values of Single Best Answer (SBA) and Extended Matching Questions (EMQ). The mean difficulty was 0.67 (±0.28) for SBA and 0.70 (±0.28) for EMQ. The independent t-test showed no significant difference between the two formats (t = −0.405, p = 0.686). Similarly, the mean discrimination was 0.32 (±0.16) for SBA and 0.35 (±0.18) for EMQ. Again, the independent t-test revealed no significant difference (t = −0.557, p = 0.620).

Qualitative item analysis

The prevalence of SBA testing low cognitive skills was 76.19%. Only 23.8% of SBAs tested higher cognitive skills. Conversely, most EMQs tested higher cognitive skills (75%), and 25% of EMQs tested low cognitive skills.

Stem flaws were found in 30 SBAs (71.40%), and option flaws were found in 23 (54.76%). Fifteen SBAs (35.7%) had more than 2 IWFs. Poorly constructed stems were the most frequent stem flaw, found in 15 questions (35.70%), followed by negatively phrased stems (33.30%), vague, unclear terms (21.40%), tricky, unnecessarily complicated stems (21.40%), no lead-in question (21.40%), and logical/grammatical cues (7.10%). Among option flaws, nonhomogeneous options were the most frequent (35.70%). The flaws “correct answer stands out”, “long, complex options”, and “inconsistent use of numeric data” were each found in 9.50% of questions. Word repeats and convergence were each found in 4.80% of questions.

Among EMQ option flaws, the absence of logical/alphabetical order was the most frequent (100.00%), while nonhomogeneous and overlapping/complex options were each found in 25.00%. Lead-in statement flaws included unclear/unfocused lead-in statements (75.00%) and nonspecific statements (50.00%). The stem flaws found were nonvignette stems and short, poorly constructed stems (25.00% each).

Figure  1 shows the four categories of MCQ based on the level of IA indices (P and D) and the presence or absence of IWFs. The four categories are as follows:

Acceptable IA indices with no IWFs: Questions have difficulty level within the acceptable range (0.3–0.7) and discrimination level > 0.2, and items are free of flaws.

Acceptable IA indices with IWFs: Questions have acceptable difficulty and discrimination levels, and items are flawed.

Nonacceptable IA indices with no IWFs: Questions have difficulty level < 0.3 or > 0.7 and discrimination < 0.2, and items are free of flaws.

Nonacceptable IA indices with IWFs: Questions have difficulty level < 0.3 or > 0.7 and discrimination < 0.2, and items are flawed.

Figure 1. The four categories of both SBA and EMQ formats based on the level of item analysis (IA) indices (difficulty P and discrimination D) and the presence or absence of IWFs

The prevalence of SBA and EMQ with acceptable IA indices and no IWFs was 14.2 and 0%, respectively. These questions should be kept in the question bank without modification. The prevalence of SBA and EMQ with acceptable IA indices but with IWFs was 23.8 and 33.3%, respectively. These questions need remediation before being kept in the question bank. Items with nonacceptable IA indices, with or without IWFs (which constitute more than 60% of the items), should be discarded from the question bank.
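The keep, remediate, or discard rule described above can be written as a small decision function. This is a sketch under stated assumptions: the function name is hypothetical, and any item that fails either the difficulty range (0.3 to 0.7) or the discrimination threshold (> 0.2) is treated as having nonacceptable indices, which is one reading of the category definitions.

```python
def triage_item(p, d, has_iwf):
    """Suggest a question-bank action from P, D, and the IWF review.

    Assumption: indices are 'acceptable' only when both criteria hold
    (0.3 <= P <= 0.7 and D > 0.2); everything else is 'nonacceptable'.
    """
    acceptable = 0.3 <= p <= 0.7 and d > 0.2
    if acceptable and not has_iwf:
        return "keep"        # acceptable indices, no flaws
    if acceptable and has_iwf:
        return "remediate"   # acceptable indices, flawed wording
    return "discard"         # nonacceptable indices, flawed or not


if __name__ == "__main__":
    # Hypothetical items: (difficulty P, discrimination D, flawed?).
    for p, d, flawed in [(0.55, 0.35, False), (0.60, 0.30, True),
                         (0.85, 0.25, False), (0.50, 0.10, True)]:
        print(p, d, flawed, "->", triage_item(p, d, flawed))
```

A bank that prefers revision over removal could return "revise" in place of "discard" for items whose only problem is, for example, nonfunctioning distractors.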

In this study, we performed both quantitative and qualitative postexamination item analysis of the summative undergraduate pediatrics MCQ exam. The quantitative analysis revealed a range of item difficulty and discrimination levels, highlighting the importance of a diverse question bank in assessing a broad spectrum of student abilities. Qualitative item analysis, on the other hand, involved a more subjective review of each item and helped to identify issues with cognitive level, item clarity, and writing flaws. It complemented the quantitative findings and provided additional insights into the quality of the items. The findings underlined the value of both quantitative and qualitative item analysis in ensuring the validity and reliability of the exam and in building a robust question bank. Both are crucial for making decisions about whether to keep, review, or remove questions from the test or question bank. These decisions enabled us to identify ideal questions and develop a valid and reliable question bank for future assessment that will enhance the quality of the assessment in undergraduate pediatrics.

An ideal MCQ is clear, focused, and relevant to the intended learning outcomes. It should have a single best answer and distractors that are plausible but incorrect. In addition, an ideal MCQ should have an appropriate level of difficulty and discrimination power. The findings from this study suggest that the proportion of ideal questions, as defined by the three criteria (difficulty level of 0.3–0.7, discrimination level > 0.2, and 100% distractor efficiency), is lower than what has been reported in previous studies. Specifically, only 4.7% of Single Best Answer (SBA) questions met these criteria, and none of the Extended Matching Questions (EMQs) did. This is in contrast to previous studies, which reported that 15–20% of MCQs fulfilled all three criteria [ 22 , 23 ]. These findings highlight the importance of rigorous question development and review processes to ensure the quality of MCQs. This could include strategies such as regular postexamination item analysis, peer review of questions, and ongoing training for question authors [ 24 , 25 ].

In this study, the mean P was higher for the EMQ than for the SBA, although the difference was not statistically significant (t = −0.405, p = 0.686). The mean D was also higher for the EMQ than for the SBA, although again the difference was not statistically significant (t = −0.557, p = 0.620). Therefore, both formats demonstrated comparable levels of difficulty and discrimination in the context of this study. This is in contrast to previous studies, which have reported significant differences in difficulty levels between these two formats: increasing the number of options made questions more difficult [ 26 , 27 ]. This discrepancy could be explained by the high number of nonfunctioning distractors (NFDs) in the EMQs, which had a significant impact on both the item difficulty and discrimination levels of the questions. First, a high number of NFDs makes it easier for examinees to identify the correct answer, which leads to easier questions. Second, NFDs can also affect item discrimination: if a question has many NFDs, it may not effectively discriminate between higher- and lower-achieving students. In this study, the percentage of nonfunctioning distractors in EMQs was 86.90%. These findings underline the importance of careful distractor selection and review in the development of EMQs. By reducing the number of NFDs, it may be possible to increase the difficulty and discrimination of the questions, thereby improving the overall quality of the assessment.

Distractor analysis of MCQs can enhance the quality of exam items. MCQ items can be fixed by replacing or removing nonfunctioning distractors rather than eliminating the whole item, which saves energy and time for future exams [ 24 ]. In both the SBA and the EMQ, we found a considerable proportion of nonfunctioning distractors (NFDs): 63.10 and 86.90%, respectively. Our faculty members need training in constructing plausible distractors to improve the quality of MCQ exams [ 28 ]. In addition, the number of options could be reduced from five to three [ 29 , 30 ]. Tarrant and Ware showed that three-option items perform as well as four-option items and suggested writing three-option items, as they require less time to develop [ 31 ]. NFDs were more commonly encountered in EMQs than in SBAs. The EMQs had more options (8 compared to 5), so it may be more difficult to create plausible distractors that attract students. All EMQs with many NFDs should be revised or even converted to SBAs [ 32 ].

The reliability coefficient (KR20) of the test was 0.84, which shows an acceptable level of reliability. The standard error of measurement (SEM) was 3.91. The SEM estimates the amount of error built into a test taker’s score and helps evaluators determine how an individual’s observed test score and true score differ. Test reliability and the SEM are interconnected: the SEM decreases as test reliability increases [ 5 ]. For a short test (fewer than 50 items), a KR20 of 0.7 is acceptable, while for a longer test (more than 50 items), a KR20 of 0.8 would be acceptable. Test reliability can be improved by removing flawed items and items that are very easy or very difficult. Items with poor correlation should be revised or discarded from the test [ 7 ].

In our study, we analyzed the cognitive levels of SBA and EMQ based on modified Bloom’s taxonomy [ 14 ]. We found that 76.19% of SBA assessed low cognitive levels, while only 25% of EMQ assessed low cognitive skills. Conversely, 75% of EMQ assessed higher cognitive skills. These results are similar to other studies that found that 60.47 and 90% of MCQs were at low cognitive levels [ 13 ]. EMQs are recommended to be used in undergraduate medical examinations to test the higher cognitive skills of advanced medical students or in high-stakes examinations [ 33 ]. A mixed examination format including SBA and EMQ was the best examination to distinguish poor from moderate and excellent students [ 34 ].

In this study, we aimed to find common technical flaws in the MCQ Pediatrics exam. We found that only 26.20% of SBA questions followed all best practices of the item writing guidelines. The prevalence of item writing flaws was 73.80% for SBA, and all EMQ sets were flawed. This high proportion of flawed items was similar to other studies, where approximately half of the analyzed items were considered flawed [ 35 ]. The high prevalence of IWFs in our study exposed the lack of preparation and time devoted by evaluators to MCQ construction. The most prevalent types of flaws in SBA questions were poorly constructed, short stems (35.70%) and negatively phrased stems (33.3%). Furthermore, all EMQs had flaws, and option flaws were the dominant type (100.00% no logical order, 25.00% nonhomogeneous, and 25.00% complex options). These findings were consistent with other studies [ 13 , 35 ].

The presence of IWFs had a negative effect on the performance of high-achieving students, giving an advantage to borderline students who probably relied on testwiseness [ 36 ]. According to Downing, MCQ tests are threatened by two factors: construct-irrelevant variance (CIV) and construct underrepresentation (CUR). CIV is the incorrect inflation or deflation of assessment scores caused by certain types of uncontrolled or systematic measurement error. CUR is the undersampling of the cognitive domain. Flawed MCQs tend to be ambiguous, unjustifiably difficult, or unjustifiably easy, which directly adds CIV to a test. CUR takes place when many of the test items are written to assess low levels of the cognitive domain, such as recall of facts [ 37 ]. All defective items found by quantitative item analysis should be analyzed for the presence of item writing flaws. Those defective items need to be correctly reconstructed and validated, and feedback should be given to the items’ authors for corrective action. Both quantitative and qualitative item analysis are necessary for the validation of viable question banks in undergraduate medical education programs [ 38 ].

Limitations and delimitations


Subjectivity in Qualitative Analysis: While the qualitative item analysis provided valuable insights, it is inherently subjective. Different assessors might have different interpretations of item clarity, cognitive level, and writing flaws. This subjectivity could potentially impact the consistency of the analysis.

Scope of the Study: The study was limited to a single summative undergraduate pediatrics MCQ exam. Therefore, the findings may not be generalizable to other exams or disciplines.

Sample Size: The study’s conclusions are based on the analysis of a single exam. A larger sample size, including multiple exams over a longer period, might provide more robust and reliable findings.


Focus on MCQs: The study was delimited to two types of multiple-choice questions (MCQs). Other types of questions, such as short answer or essay questions, were not included in the analysis.

Single Medical School: The study was conducted within a single medical school, which may limit the generalizability of the findings to other medical schools with different student populations or assessment practices.

Despite these limitations and delimitations, the study provides valuable insights into the importance of both quantitative and qualitative item analysis in ensuring the validity and reliability of exams and in building a robust question bank. Future research could aim to address these limitations and delimitations to further enhance the quality of MCQ assessment in undergraduate medical education.


In summary, item analysis is a vital procedure to ascertain the quality of MCQ assessments in undergraduate medical education. We demonstrated that quantitative item analysis can yield valuable data about the psychometric properties of each item. Furthermore, it can assist us in selecting “ideal MCQs” for the question bank. Nevertheless, quantitative item analysis is insufficient by itself. We also require qualitative item analysis to detect and rectify flawed items. We discovered that numerous items had satisfactory indices but were inadequately constructed or had a low cognitive level. Hence, both quantitative and qualitative item analysis can enhance the validity of MCQ assessments by making informed judgments about each item and the assessment as a whole.
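As a rough illustration of the quantitative side of item analysis described above, the following Python sketch computes two commonly used indices for a single item. It rests on assumptions that are not taken from the study: the difficulty index is taken as the proportion of examinees answering correctly, the discrimination index uses the conventional upper and lower 27% groups ranked by total score, and all function names and data are hypothetical. The formulas actually used by the authors may differ.

```python
from typing import List


def difficulty_index(item_scores: List[int]) -> float:
    """Proportion of examinees who answered the item correctly (0.0 to 1.0)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(
    item_scores: List[int],
    total_scores: List[float],
    fraction: float = 0.27,  # assumed convention: upper/lower 27% groups
) -> float:
    """Upper-lower discrimination index for one item.

    Examinees are ranked by total exam score; the index is the difference
    between the number of correct answers in the top and bottom groups,
    divided by the group size.
    """
    n = len(item_scores)
    group_size = max(1, round(n * fraction))
    # Indices of examinees ordered from highest to lowest total score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    upper_correct = sum(item_scores[i] for i in upper)
    lower_correct = sum(item_scores[i] for i in lower)
    return (upper_correct - lower_correct) / group_size


# Hypothetical data: 10 examinees, 1 = correct and 0 = incorrect on one item.
demo_item_scores = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
demo_total_scores = [9, 8, 7, 6, 5, 5, 4, 3, 2, 1]

print(difficulty_index(demo_item_scores))
print(discrimination_index(demo_item_scores, demo_total_scores))
```

Indices like these only flag items statistically. As the summary notes, an item with satisfactory values can still be poorly constructed or test a low cognitive level, so a qualitative review is still needed.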

Availability of data and materials

Primary data are available from the corresponding author upon reasonable request.


multiple choice question

item analysis

item difficulty

item discrimination

distractor efficiency

single best answer

extended matching questions

item writing flaws.

References

van der Vleuten CPM. The assessment of professional competence: developments, research and practical implications. Adv Health Sci Educ. 1996;1(1):41–67.

Kumar A, George C, Harry Campbell M, Krishnamurthy K, Michele Lashley P, Singh V, et al. Item analysis of multiple choice and extended matching questions in the final MBBS medicine and therapeutics examination. J Med Educ. 2022;21(1)

Salam A, Yousuf R, Bakar SMA. Multiple choice questions in medical education: how to construct high quality questions. Int J Human Health Sci (IJHHS). 2020;4(2):79.

Reynolds CR, Altmann RA, Allen DN. Mastering modern psychological testing. 2nd ed. Cham: Springer International Publishing; 2021.

Tavakol M, Dennick R. Post-examination analysis of objective tests. Med Teach. 2011;33(6):447–58.

Rahim Hingorjo M, Jaleel F. Analysis of one-best MCQs: the difficulty index, discrimination index and distractor efficiency. J Pakistan Med Assoc. 2012;62(2):142.

Tavakol M, O’Brien D. Psychometrics for physicians: everything a clinician needs to know about assessments in medical education. Int J Med Educ. 2022;13:100–6.

Rush BR, Rankin DC, White BJ. The impact of item-writing flaws and item complexity on examination item difficulty and discrimination value. BMC Med Educ. 2016;16(1):250.

Kaur M, Singla S, Mahajan R. Item analysis of in use multiple choice questions in pharmacology. Int J Appl Basic Med Res. 2016;6(3):170.

Ali SH, Ruit KG. The impact of item flaws, testing at low cognitive level, and low distractor functioning on multiple-choice question quality. Perspect Med Educ. 2015;4(5):244–51.

Billings MS, Deruchie K, Hussie K, Kulesher A, Merrell J, Swygert KA, et al. Constructing written test questions for the health sciences. 6th ed. Philadelphia: National Board of Medical Examiners; 2020. Available from: (accessed 22 August 2023)

Haladyna TM, Downing SM, Rodriguez MC. A review of multiple-choice item-writing guidelines for classroom assessment. Appl Meas Educ. 2002;15(3):309–34.

Tariq S, Tariq S, Maqsood S, Jawed S, Baig M. Evaluation of cognitive levels and item writing flaws in medical pharmacology internal assessment examinations. Pak J Med Sci. 2017;33(4):866–70.

Palmer EJ, Devitt PG. Assessment of higher order cognitive skills in undergraduate education: modified essay or multiple choice questions? Research paper. BMC Med Educ. 2007:7.

Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.

Case SM, Swanson DB. Extended-matching items: a practical alternative to free-response questions. Teach Learn Med. 1993;5(2):107–15.

Tavakol M, Dennick R. Post-examination interpretation of objective test data: Monitoring and improving the quality of high-stakes examinations: AMEE Guide No. 66. Med Teach. 2012 Mar;34(3).

Downing SM. Reliability: on the reproducibility of assessment data. Med Educ. 2004;38:1006–12.

Templeton GF. A two-step approach for transforming continuous variables to Normal: implications and recommendations for IS research. Commun Assoc Inf Syst. 2011;28

Tavakol M, Dennick R. Making sense of Cronbach’s alpha. Int J Med Educ. 2011;27(2):53–5.

Tighe J, McManus I, Dewhurst NG, Chis L, Mucklow J. The standard error of measurement is a more appropriate measure of quality for postgraduate medical assessments than is reliability: an analysis of MRCP (UK) examinations. BMC Med Educ. 2010;10(1):40.

Kumar D, Jaipurkar R, Shekhar A, Sikri G, Srinivas V. Item analysis of multiple choice questions: a quality assurance test for an assessment tool. Med J Armed Forces India. 2021;1(77):S85–9.

Wajeeha D, Alam S, Hassan U, Zafar T, Butt R, Ansari S, et al. Difficulty index, discrimination index and distractor efficiency in multiple choice questions. Annals of PIMS. 2018;4. ISSN:1815-2287.

Tarrant M, Ware J, Mohammed AM. An assessment of functioning and non-functioning distractors in multiple-choice questions: a descriptive analysis. BMC Med Educ. 2009;9(1)

Tarrant M, Knierim A, Hayes SK, Ware J. The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments. Nurse Educ Pract. 2006;6(6):354–63.

Swanson DB, Holtzman KZ, Allbee K, Clauser BE. Psychometric characteristics and response times for content-parallel extended-matching and one-best-answer items in relation to number of options. Acad Med. 2006;81(Suppl):S52–5.

Case SM, Swanson DB, Ripkey DR. Comparison of items in five-option and extended-matching formats for assessment of diagnostic skills. Acad Med. 1994;69(10):S1–3.

Naeem N, van der Vleuten C, Alfaris EA. Faculty development on item writing substantially improves item quality. Adv Health Sci Educ. 2012;17(3):369–76.

Raymond MR, Stevens C, Bucak SD. The optimal number of options for multiple-choice questions on high-stakes tests: application of a revised index for detecting nonfunctional distractors. Adv Health Sci Educ. 2019;24(1):141–50.

Kilgour JM, Tayyaba S. An investigation into the optimal number of distractors in single-best answer exams. Adv Health Sci Educ. 2016;21(3):571–85.

Tarrant M, Ware J. A comparison of the psychometric properties of three- and four-option multiple-choice questions in nursing assessments. Nurse Educ Today. 2010;30(6):539–43.

Vuma S, Sa B. A descriptive analysis of extended matching questions among third year medical students. Int J Res Med Sci. 2017;5(5):1913.

Frey A, Leutritz T, Backhaus J, Hörnlein A, König S. Item format statistics and readability of extended matching questions as an effective tool to assess medical students. Sci Rep. 2022;12(1)

Eijsvogels TMH, van den Brand TL, Hopman MTE. Multiple choice questions are superior to extended matching questions to identify medicine and biomedical sciences students who perform poorly. Perspect Med Educ. 2013;2(5–6):252–63.

Downing SM. The effects of violating standard item writing principles on tests and students: the consequences of using flawed test items on achievement examinations in medical education. Adv Health Sci Educ. 2005;10(2):133–43.

Tarrant M, Ware J. Impact of item-writing flaws in multiple-choice questions on student achievement in high-stakes nursing assessments. Med Educ. 2008;42(2):198–206.

Downing SM. Threats to the validity of locally developed multiple-choice tests in medical education: construct-irrelevant variance and construct underrepresentation. Adv Health Sci Educ. 2002;7(3):235–41.

Bhat SK, Prasad KHL. Item analysis and optimizing multiple-choice questions for a viable question bank in ophthalmology: a cross-sectional study. Indian J Ophthalmol. 2021;69(2):343–6.

Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB). This study received no institutional or external funding.

Author information

Authors and affiliations.

Pediatrics, Qena Faculty of Medicine, South Valley University, Qena, Egypt

Nagwan I. Rashwan

Community Medicine, Faculty of Medicine, Alexandria University, Alexandria, Egypt

Soha R. Aref

Clinical Pharmacology, Faculty of Medicine, Alexandria University, Alexandria, Egypt

Omnia A. Nayel

Medical Education, Faculty of Medicine, Alexandria University, Alexandria, Egypt

Mennatallah H. Rizk


NIR, SRA, OAN and MHR conceived and designed the study and wrote the manuscript. NIR and MHR undertook all statistical analyses. NIR and MHR designed and implemented the assessment and provided data for analysis. All the authors have read and approved the final manuscript.

Corresponding author

Correspondence to Mennatallah H. Rizk.

Ethics declarations

Ethics approval and consent to participate.

Ethical approval was sought from the Faculty of Medicine, Alexandria University Human Research Ethics Committee, and the study was granted exemption from human research ethics review. Students were not asked to provide consent to participate, as the study was exempt from human research ethics review.

Consent to publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit . The Creative Commons Public Domain Dedication waiver ( ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Rashwan, N.I., Aref, S.R., Nayel, O.A. et al. Postexamination item analysis of undergraduate pediatric multiple-choice questions exam: implications for developing a validated question Bank. BMC Med Educ 24, 168 (2024).

Received: 24 September 2023

Accepted: 08 February 2024

Published: 21 February 2024


  • Single best answer questions
  • Extended matching questions
  • Item analysis
  • Item writing flaws
  • Question Bank

BMC Medical Education

ISSN: 1472-6920

EU AI Act: first regulation on artificial intelligence

The use of artificial intelligence in the EU will be regulated by the AI Act, the world’s first comprehensive AI law. Find out how it will protect you.

As part of its digital strategy, the EU wants to regulate artificial intelligence (AI) to ensure better conditions for the development and use of this innovative technology. AI can create many benefits, such as better healthcare; safer and cleaner transport; more efficient manufacturing; and cheaper and more sustainable energy.

In April 2021, the European Commission proposed the first EU regulatory framework for AI. It says that AI systems that can be used in different applications are analysed and classified according to the risk they pose to users. The different risk levels will mean more or less regulation. Once approved, these will be the world’s first rules on AI.

What Parliament wants in AI legislation

Parliament’s priority is to make sure that AI systems used in the EU are safe, transparent, traceable, non-discriminatory and environmentally friendly. AI systems should be overseen by people, rather than by automation, to prevent harmful outcomes.

Parliament also wants to establish a technology-neutral, uniform definition for AI that could be applied to future AI systems.

AI Act: different rules for different risk levels

The new rules establish obligations for providers and users depending on the level of risk from artificial intelligence. While many AI systems pose minimal risk, they need to be assessed.

Unacceptable risk

Unacceptable risk AI systems are systems considered a threat to people and will be banned. They include:

  • Cognitive behavioural manipulation of people or specific vulnerable groups: for example voice-activated toys that encourage dangerous behaviour in children
  • Social scoring: classifying people based on behaviour, socio-economic status or personal characteristics
  • Biometric identification and categorisation of people
  • Real-time and remote biometric identification systems, such as facial recognition

Some exceptions may be allowed for law enforcement purposes. “Real-time” remote biometric identification systems will be allowed in a limited number of serious cases, while “post” remote biometric identification systems, where identification occurs after a significant delay, will be allowed to prosecute serious crimes and only after court approval.

High risk

AI systems that negatively affect safety or fundamental rights will be considered high risk and will be divided into two categories:

1) AI systems that are used in products falling under the EU’s product safety legislation. This includes toys, aviation, cars, medical devices and lifts.

2) AI systems falling into specific areas that will have to be registered in an EU database:

  • Management and operation of critical infrastructure
  • Education and vocational training
  • Employment, worker management and access to self-employment
  • Access to and enjoyment of essential private services and public services and benefits
  • Law enforcement
  • Migration, asylum and border control management
  • Assistance in legal interpretation and application of the law.

All high-risk AI systems will be assessed before being put on the market and also throughout their lifecycle.

General purpose and generative AI

Generative AI, like ChatGPT, would have to comply with transparency requirements:

  • Disclosing that the content was generated by AI
  • Designing the model to prevent it from generating illegal content
  • Publishing summaries of copyrighted data used for training

High-impact general-purpose AI models that might pose systemic risk, such as the more advanced AI model GPT-4, would have to undergo thorough evaluations and any serious incidents would have to be reported to the European Commission.

Limited risk

Limited risk AI systems should comply with minimal transparency requirements that would allow users to make informed decisions. After interacting with an application, users can decide whether they want to continue using it. Users should be made aware when they are interacting with AI. This includes AI systems that generate or manipulate image, audio or video content, for example deepfakes.

On December 9, 2023, Parliament reached a provisional agreement with the Council on the AI Act. The agreed text will now have to be formally adopted by both Parliament and Council to become EU law. Before all MEPs have their say on the agreement, Parliament’s internal market and civil liberties committees will vote on it.


  1. How to Write a Literary Analysis Essay Step by Step

  2. All about Textual Analysis Essay Writing

  3. 7+ Literary Analysis Templates

  4. How to Write an Analysis (with Pictures)

  5. How to Write a Rhetorical Analysis Essay: Outline, Steps, & Examples


  1. Writing a SUMMARY

  2. Text Analysis Presentation_ENG4U

  3. Writing text😭

  4. Explanation Text

  5. Text Analysis Presentation

  6. Text Analytics 12 Abstractive Summarization


  1. 7 Simple Techniques to Analyze Your Text for Better Writing

    Analyzing texts is a vital skill for improving writing. By examining different texts, you can learn a lot about structure, style, and content. This knowledge is key to enhancing your own writing. Understanding how authors construct their works gives you tools to develop your style.

  2. Textual Analysis

    Textual analysis is a broad term for various research methods used to describe, interpret and understand texts. All kinds of information can be gleaned from a text - from its literal meaning to the subtext, symbolism, assumptions, and values it reveals. The methods used to conduct textual analysis depend on the field and the aims of the research.

  3. How to Write a Literary Analysis Essay

    Step 1: Reading the text and identifying literary devices The first step is to carefully read the text (s) and take initial notes. As you read, pay attention to the things that are most intriguing, surprising, or even confusing in the writing—these are things you can dig into in your analysis.

  4. Analyzing a Text

    Analyzing a Text Written Texts When you analyze an essay or article, consider these questions: What is the thesis or central idea of the text? Who is the intended audience? What questions does the author address? How does the author structure the text? What are the key parts of the text? How do the key parts of the text interrelate?

  5. Analyzing a Written Text

    Analyzing a Written Text - Thomas The following set of questions is one tool you will use to analyze texts. We will use it together to analyze "In the Garden of Tabloid Delight." You may wish to employ it in the future as we analyze other texts together and as you work on your portfolio.

  6. How to Write an Analysis (with Pictures)

    2. Create an outline for your analysis. Building on your thesis and the arguments you sketched out while doing your close read of the document, create a brief outline. Make sure to include the main arguments you would like to make as well as the evidence you will use to support each argument.

  7. PDF Strategies for Essay Writing

    o If you're writing a research paper, do not assume that your reader has read all the sources that you are writing about. You'll need to offer context about what those sources say so that your reader can understand why you have brought them into the conversation. o If you're writing only about assigned sources, you will still need to provide

  8. How to Write a Rhetorical Analysis

    Revised on July 23, 2023. A rhetorical analysis is a type of essay that looks at a text in terms of rhetoric. This means it is less concerned with what the author is saying than with how they say it: their goals, techniques, and appeals to the audience.

  9. How to Write a Text Analysis

    An analysis is written in your own words and takes the text apart bit by bit. It usually includes very few quotes but many references to the original text. It analyzes the text somewhat like a forensics lab analyzes evidence for clues: carefully, meticulously and in fine detail.

  10. Textual Analysis

    Textual analysis is a broad term for various research methods used to describe, interpret and understand texts. All kinds of information can be gleaned from a text - from its literal meaning to the subtext, symbolism, assumptions, and values it reveals. The methods used to conduct textual analysis depend on the field and the aims of the ...

  11. 5 Ways to Analyze Texts

    Make sure you write down the main ideas and any supporting details provided by the text. For a fiction text, write down the names and basic information about characters. Additionally, make note of any symbolism and use of literary devices. For a nonfiction text, write down important facts, figures, methods, and dates. 5.

  12. Home

    Text analysis and writing analysis texts are important skills to develop as they allow individuals to critically engage with written material, understand underlying themes and arguments, and communicate their own ideas in a clear and effective manner. These skills are essential in academic and professional settings, as well as in everyday life ...

  13. How to Write Literary Analysis

    Literary analysis involves examining all the parts of a novel, play, short story, or poem—elements such as character, setting, tone, and imagery—and thinking about how the author uses those elements to create certain effects. A literary essay isn't a book review: you're not being asked whether or not you liked a book or whether you'd ...

  14. How to Write an Analytical Essay in 6 Steps

    An analytical essay is an essay that meticulously and methodically examines a single topic to draw conclusions or prove theories. Although they are used in many fields, analytical essays are often used with art and literature to break down works' creative themes and explore their deeper meanings and symbolism.

  15. Beginner's Guide to Literary Analysis

    When conducting literary analysis while reading a text or discussing it in class, you can pivot easily from one argument to another (or even switch sides if a classmate or teacher makes a compelling enough argument). But when writing literary analysis, your objective is to propose a specific, arguable thesis and convincingly defend it. In order ...

  16. Textual Analysis

    Textual Analysis Examples Step 1: Understand the Foundations of the Text The first thing you need to do when completing a textual analysis is to build a strong foundational understanding of the text. This will help you create a more nuanced and complex analysis later on! #1: Make a Plot Summary

  17. Text Analyser

    Syllables per Word Our advanced text analyser gives a much more detailed analysis of text with many more statistics. Test Your Readability. Discover how understandable your text is. Use these readability statistics to help you assess the complexity of a text and how hard it is to read and understand.

  18. Text Analysis

    Below are sample text analysis assignments: Text Analysis Papers Description "Handout" Sample Essay Assignment "Two Options" Sample Assignment "Novel Response" (Kennedy) Sample Essay Assignment "Literary Analysis Paper: Critical Comparison of Short Fiction" (Kennedy) Sample Essay Assignment "Literary Analysis Paper: Constructing a Canon" (Kennedy)

  19. Analyze My Writing

    An Online Writing Sample Content and Readability Analyzer: analyze your writing and get statistics on words you use most frequently, word and sentence length, readability, punctuation usage, and more.

  20. A Guide: Text Analysis, Text Analytics & Text Mining

    1. Definition: What is text analysis? What is the difference between text analysis, text mining and text analytics? What is the difference between text analysis and natural language processing? 2. What are the applications and use cases of text analysis? Text Analysis for Customer Service Teams Text Analytics for Marketing Teams

  21. Textual Analysis: Definition, Types & 10 Examples

    Textual analysis is a research methodology that involves exploring written text as empirical data. Scholars explore both the content and structure of texts, and attempt to discern key themes and statistics emergent from them.

  22. Text Analysis Examples to Apply to Your Data

    Text Analysis Examples. There are two main text analysis techniques that you can use - text classification and text extraction. While they can be used individually, you'll be able to get more detailed insights when you use them in unison. Let's take a look at some examples of the most popular text analysis techniques. Text classification

  23. What is Text Analysis? A Beginner's Guide

    Text analysis (TA) is a machine learning technique used to automatically extract valuable insights from unstructured text data. Companies use text analysis tools to quickly digest online data and documents, and transform them into actionable insights.

  24. Free high-quality Text Analysis Tools

    Elevate Your Writing with Prose Analyzer. Welcome to Prose Analyzer - your advanced online text analysis tool. Whether you are a student polishing an essay, an educator guiding learners, a researcher handling vast amounts of data, or a content creator shaping impactful prose, Prose Analyzer is your comprehensive solution.

  25. Postexamination item analysis of undergraduate pediatric multiple

    Item analysis (IA) is widely used to assess the quality of multiple-choice questions (MCQs). The objective of this study was to perform a comprehensive quantitative and qualitative item analysis of two types of MCQs: single best answer (SBA) and extended matching questions (EMQs) currently in use in the Final Pediatrics undergraduate exam. A descriptive cross-sectional study was conducted.

  26. EU AI Act: first regulation on artificial intelligence

    In April 2021, the European Commission proposed the first EU regulatory framework for AI. It says that AI systems that can be used in different applications are analysed and classified according to the risk they pose to users. The different risk levels will mean more or less regulation. Once approved, these will be the world's first rules on AI.

  27. Trump's new Supreme Court gambit doesn't even try to hide ...

    The ex-president asked the Supreme Court on Monday to step in to temporarily block a US Court of Appeals decision that last week eviscerated his claims that a president is effectively above the ...