Karl Bayer's Disputing Blog - Mediator, Arbitrator, Court Master & Technical Advisor

E-Discovery and the Enron E-mail Dataset Research

by Victoria VanBuren

Tuesday, Oct 27, 2009



Last week I had the honor of guest-posting at Peter Vogel’s impressive Internet, Information Technology, & e-Discovery Blog. As some readers may know, Mr. Vogel is a frequent guest-blogger here at Disputing.

Check out my guest blog!


E-Discovery and the Enron E-mail Dataset Research

By Victoria VanBuren, October 21, 2009.

The U.S. Supreme Court’s grant of certiorari to former Enron CEO Jeffrey Skilling dominated the news headlines last week. Interestingly, the Federal Energy Regulatory Commission (FERC), during its investigation into Enron’s involvement in the energy crisis of 2000-01, made available to the public a large database called the “Enron Corpus.” This dataset consists of about half a million e-mail communications from former Enron senior executives and energy traders.

Enron E-mail Dataset Research

Because of its size and public status, the Enron Corpus is a rare and valuable tool for experimenting with text classification methods. Since FERC posted it to the web, the dataset has been the subject of research by the computer science departments of several universities, including the Massachusetts Institute of Technology and Stanford University. In the summer of 2009, the team at TREC Legal Track, an organization co-sponsored by the U.S. Department of Defense, started conducting research on the Enron Corpus with the purpose of improving large-scale search techniques.

Our Research – Bayesian Text Classifier

In the spring of 2009, Texas State University computer science students David Villarreal, Thomas McMillen, Andrew Minnick, and I, under the supervision of computer forensics expert Wilbon Davis, used the Enron Corpus to train a Bayes-based algorithm to classify the Enron e-mails as relevant or irrelevant to a given legal issue. This type of algorithm is commonly used by e-mail spam filters.
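For readers curious about what a Bayes-based classifier looks like in practice, here is a minimal sketch in Python. It is not the code from our project: it is a textbook multinomial naive Bayes with add-one smoothing, and the class name, the four toy training e-mails, and the labels are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training e-mails
        self.vocabulary = set()      # every word seen during training

    def train(self, labeled_emails):
        """Count word frequencies per label from (text, label) pairs."""
        for text, label in labeled_emails:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocabulary.add(word)

    def log_scores(self, text):
        """Return log P(label) + sum of log P(word | label) for each label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = [w for w in tokenize(text) if w in self.vocabulary]
        scores = {}
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in words:
                # Add-one smoothing keeps unseen word/label pairs above zero.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def classify(self, text):
        """Pick the label with the highest log score."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy training set; real training would use hand-labeled
# e-mails from the Enron Corpus.
TRAINING_EMAILS = [
    ("schedule the power trades for california tomorrow", "relevant"),
    ("california power prices and trading strategy", "relevant"),
    ("lunch on friday with the team", "irrelevant"),
    ("football tickets for the game on friday", "irrelevant"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING_EMAILS)
print(classifier.classify("new trading strategy for california power"))
print(classifier.classify("friday lunch with the football team"))
```

The idea is the same one a spam filter uses: words that appear often in e-mails already labeled relevant push a new e-mail toward the relevant class, and words typical of irrelevant e-mails push it the other way.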

The Results

The team hoped that this mathematical approach would achieve better accuracy than the roughly 20% found using Boolean keyword searching, a method employed by many lawyers. Surprisingly, the Bayesian filter correctly identified e-mails known to be relevant at average rates ranging from 43% to 66%. As expected, accuracy on the irrelevant e-mails was even higher, with averages ranging from 44% to 77%. Texas State University published the Technical Report last week and it can be downloaded for free here.
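To make the comparison above concrete, the sketch below shows a simple Boolean keyword search and one way to score any classifier separately on relevant and irrelevant e-mails. It rests on assumptions: "accuracy" for a class is taken here to mean the fraction of e-mails known to belong to that class that were labeled correctly, which may differ from the definition in the Technical Report. The function names, the query, and the four sample e-mails are invented for illustration.

```python
def boolean_keyword_match(text, all_of=(), any_of=(), none_of=()):
    """Return True if the text satisfies an AND / OR / NOT keyword query."""
    words = set(text.lower().split())
    has_all = all(keyword in words for keyword in all_of)
    has_any = (not any_of) or any(keyword in words for keyword in any_of)
    has_none = not any(keyword in words for keyword in none_of)
    return has_all and has_any and has_none


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of e-mails in each true class that were labeled correctly."""
    totals = {}
    correct = {}
    for truth, predicted in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == predicted:
            correct[truth] = correct.get(truth, 0) + 1
    return {label: correct.get(label, 0) / totals[label] for label in totals}


# Hypothetical hand-labeled sample, not taken from the Enron Corpus.
SAMPLE_EMAILS = [
    ("california power trading strategy", "relevant"),
    ("energy prices in the west", "relevant"),
    ("power outage at the office gym", "irrelevant"),
    ("lunch on friday", "irrelevant"),
]

true_labels = [label for _, label in SAMPLE_EMAILS]
predicted_labels = [
    "relevant" if boolean_keyword_match(text, any_of=("power",)) else "irrelevant"
    for text, _ in SAMPLE_EMAILS
]
print(per_class_accuracy(true_labels, predicted_labels))
```

In this toy sample the single keyword "power" misses a relevant e-mail that never uses the word and wrongly flags an irrelevant one that does, which is the kind of weakness that motivates trying a statistical classifier instead.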


About Victoria VanBuren

Born and raised in Mexico, Victoria is a native Spanish speaker and a graduate of the Monterrey Institute of Technology (Instituto Tecnologico y de Estudios Superiores de Monterrey), or "the MIT of Latin America." She concentrated in physics and mathematics. Immediately after completing her work at the Institute, Victoria moved to Canada to study English and French. On her way back to Mexico, she landed in Dallas and managed to have her luggage lost at the airport. Charmed by the Texas hospitality, she decided to stay and made her way back to Austin, which she's adopted as home.


About Disputing

Disputing is published by Karl Bayer, a dispute resolution expert based in Austin, Texas. Articles published on Disputing aim to provide original insight and commentary around issues related to arbitration, mediation and the alternative dispute resolution industry.

To learn more about Karl and his team, or to schedule a mediation or arbitration with Karl’s live scheduling calendar, visit www.karlbayer.com.


© 2025, Karl Bayer. All rights reserved. Privacy Policy