"Guo Ke Special Article" From Harvey and the Lawsnote Judgment: Copyright Wars in the Age of AI

The first-instance verdict in the Seven Laws Lawsnote copyright case shocked the legal community. In this article, crypto lawyer Lin Hongyu (Guo Ke) examines the legal-compliance controversies of the AI era. (Previously: Lawsnote's crawler draws a 4-year sentence plus a fine of more than 100 million! Why the founders lament that "running a startup in Taiwan is worse than committing fraud") (Background: Taiwan's first data-crawler sentencing: "legal version of Google" Seven Laws Lawsnote found to have infringed Legal Source Information, and its 2 founders receive a rare sentence of 4 years plus a fine of 100 million yuan)

Harvey, a tool backed by OpenAI and regarded as the world's leading legal AI solution, has not only been adopted by many top law firms but is also billed as the next revolutionary product to change the legal industry. But behind this powerful technology, one fact is kept notably low-key: "Harvey has not integrated the judgments of Lexis, Westlaw, or any other commercial legal database."

Does it make sense that a legal AI tool does not directly draw on the largest legal databases in the United States? It is not that the technology can't do it; it is that they don't dare.

Harvey's design includes a key strategic arrangement: the system requires users to upload the judgments they have found, or to enter specific statutory provisions and judgment information, before it can analyze them. In other words, the model's own database does not actively include data from Lexis or Westlaw, and it even deliberately avoids the summary language and classification architecture attached to their judgment data.

Why? Because legal databases around the world are "super fierce". This is not only common knowledge in the legal AI industry; it has long been borne out in international judicial practice. A representative example is Thomson Reuters v. Ross Intelligence, in which a database platform asserted its "editorial copyright" against a legal AI startup.

Westlaw v. Ross Intelligence: How do legal databases litigate "edited works"?

In that case, Thomson Reuters claimed copyright in the "original and revised collection of legal materials" on its Westlaw platform, including the Headnotes (summaries of a judgment's key points) and the Key Number System (a classification-label architecture) that it designed. Ross Intelligence is a legal AI startup that developed a natural-language search engine which automatically answers legal questions and cites judgments. Part of the legal text Ross used to train its AI search engine consisted of Headnotes and classification-tag data from the Westlaw system.

Thomson Reuters sued, alleging that Ross's unauthorized reproduction and use of its original summaries and classification structure infringed its editorial copyright. The claim does not seek to protect the original text of the judgments, but rather the database platform's commercial editing work in rearranging, repackaging, and reclassifying public materials.

The lawsuit shows that, as AI and data-driven learning become more widespread, the copyright boundary asserted by database platforms is shifting from "content owner" to "structural monopoly". Harvey understands this well, which is why its product architecture is designed for risk isolation and avoids touching any dataset already protected by editorial copyright.

Harvey's judgment appears to have been correct: whether in the United States or Taiwan, databases sue potential competitors, and Taiwan's Legal Source Information database is no exception.

The conflict between Legal Source Information and Lawsnote: business as usual in the market

Take the controversial Lawsnote judgment as an example. Lawsnote, as an open legal data retrieval platform, clearly competes with the traditional paid database Legal Source Information in both the nature of its services and its business model.
Both offer judicial decision search, statutory search, and the summarizing and annotating of judicial reasoning, and both charge subscription fees. In this market landscape, litigation between database platforms is unfortunate, but from a practical standpoint, suing a competitor is a reasonable strategic option under the current system.

Similar situations are common in the technology industry: from the patent war between Apple and Samsung to the disputes among several data service providers in the United States, using "patents" and "copyrights" as competitive weapons has long been nothing unusual. The question is not whether databases will sue, but how courts will hear their claims and define their legitimacy.

Courts should be gatekeepers of boundaries, not enablers of data hegemony

The copyright system is inherently balanced and flexible. When a creator alleges infringement, the court should weigh the claim against the public interest, fair use, and the value of innovation. In particular, when faced with something like a "compilation structure of legal information", the question of whether it qualifies as copyrightable subject matter should turn on a specific analysis of its creativity and originality; a court should not rashly presume that it is a protected work merely because compiling and annotating took effort.

In the US judgment, even where the plaintiff claimed that its data classification structure was creative, the court went on to examine whether the classification was truly highly original, whether it overlapped with the public domain, and whether protecting it would substantially restrict the room for AI development. Taiwan's Lawsnote judgment, by contrast, treated the platform's classification and listing of information in a highly simplified way, regarded it as deserving a high degree of protection, and then imposed criminal liability and enormous civil damages.
It is deeply regrettable that this treatment does not show the balance and technical understanding one should expect of a court.

The boundary of editorial copyright: organizing alone does not make a creation

Personally, I have always believed that the protection of so-called "editorial works" should not become a high wall against the reuse of public information. If the classification logic is one the Legal Source platform designed itself, such as a curated compilation of selected infringement cases, it can of course claim editorial-work protection for that classification.

However, if judgments are merely listed in statutory order or by year, or if judgment summaries are simply filed under the corresponding provisions, this should be regarded as a use of the public domain, not as an object of private property rights. Copyright should protect creations, not structured repetitions of public information. Loosening the standard for editorial copyright claims too far will only lead to the monopolization of knowledge and to self-restraint among innovators.

Conclusion: Databases have become an obstacle to the industry, and courts should set standards for the line between information use and copyright

Harvey doesn't touch databases, not because it doesn't want to, but because it can't. This is not merely a design choice by a single platform; it is an institutional dilemma that is holding back the development of legal AI.

What we see today is not just one Lawsnote judgment in Taiwan, but a global phenomenon: the monopolistic data control of legal databases, combined with "editorial copyright" claims under the Copyright Law, is systematically hindering the development of legal AI. In fact, in the United States, platforms such as Westlaw and Lexis have already used legal proceedings to leave many AI startups unable to fight back and forced them to exit the market.
These databases not only refuse to grant licenses, but also claim that all of their classification structures and summary texts are creative and that using them to train models constitutes infringement, with criminal liability at worst and sky-high damages at the least, completely closing off the space for innovation. Harvey is today referred to as a "legal assistant" rather than a "legal intelligence system" in large part because it cannot fully learn from the judgments in case law. If he is legally shipped...
