Roblog

15 posts about ai

  • In a lengthy article that is my new gold standard for analysing the actual consequences of introducing AI into an industry, Justin Curl, Sayash Kapoor, and Arvind Narayanan explore the effects of AI on legal work.

    Sayash and Arvind had previously authored AI as Normal Technology (also well worth reading), which argues that it’s a mistake by both boosters and doomsters to treat AI as some special category of thing that will lead to humanlike or superhuman intelligence; instead, we should treat it, and should remain in control of it, like any other technology. In particular, we should measure its impact by how it diffuses through society and industry, not by what it’s capable of in isolation, and we should expect that diffusion to take decades, not months.

    This article is their attempt to apply the “AI as normal technology” thinking to a particular industry.

    As a document- and language-focused industry, law has seemed ripe for LLM-based disruption; as an industry that often commands extremely high fees for its work, many people are desperate for that disruption to happen, and for legal work to become cheaper. That motivated reasoning is perhaps one reason why law regularly tops the panic-inducing lists of “jobs that won’t exist in the future because of AI”.

    The authors’ conclusions are much more nuanced. They look at regulatory concerns, which might inhibit AI’s progress into the industry; they look at the fundamental dynamics of the industry, in this case the adversarial nature of common-law countries; they look at the impact of past technologies, which often failed to deliver the commodification of law that many people imagined.

    AI, they conclude, won’t fix any of the structural problems faced by the legal industry and those who interact with it as clients, and may even magnify them; but it may be that the mere threat of AI disruption can be an external impetus to reform, reform that otherwise might have faltered through lack of will and coordination.

    I think both the specific conclusions about law and the general framework outlined by the authors are enormously helpful contributions to the discourse around AI, and I’ll be trying to think along similar lines in my own work. #

  • The New York Times has developed a tool to download, transcribe, and summarise various right-wing podcasts, part of what they call the “manosphere”, in order to spot signs of division and discontent within Donald Trump’s base:

    “When one of the shows publishes a new episode, the tool automatically downloads it, transcribes it, and summarizes the transcript. Every 24 hours the tool collates those summaries and generates a meta-summary with shared talking points and other notable daily trends. The final report is automatically emailed to journalists each morning at 8 a.m. ET. Currently, the tool is used by nearly 40 reporters across the newsroom.”
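The pipeline described there has a simple shape: a per-episode step (download, transcribe, summarise) feeding a daily roll-up. Here is a minimal sketch of that shape in Python – the `transcribe` and `summarise` stubs are hypothetical stand-ins for whatever ASR and LLM services the Times actually uses, not their real code:

```python
# Hypothetical sketch of the pipeline's shape; transcribe() and
# summarise() are stand-ins for real speech-to-text and LLM calls.

def transcribe(audio_path):
    # Stand-in for a speech-to-text service.
    return f"transcript of {audio_path}"

def summarise(text):
    # Stand-in for an LLM summarisation call.
    return f"summary: {text[:40]}"

def process_new_episode(audio_path, summaries):
    """Per-episode step: transcribe the audio, then summarise the transcript."""
    transcript = transcribe(audio_path)
    summaries.append(summarise(transcript))

def daily_report(summaries):
    """Daily step: collate the day's summaries into one meta-summary."""
    collated = "\n".join(summaries)
    return summarise("Shared talking points across shows:\n" + collated)

summaries = []
process_new_episode("show_a_ep_12.mp3", summaries)
process_new_episode("show_b_ep_98.mp3", summaries)
report = daily_report(summaries)
```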

    It’s a fascinating use of LLMs in the newsroom. Whether this ends up making bias worse, as the apparent consensus of grifters and racists becomes a signal that influences the Times’ own reporting, is a complicated but vital question. But this specific use-case is just one of many. The tool grew out of another, called Cheatsheet, which sounds essentially like a no-code tool that allows journalists to execute LLM-based workflows against custom data, and which has already led to significant investigative breakthroughs:

    “The Initiatives Team started trialing other applications of LLMs to process large, messy datasets and file dumps on a case-by-case basis. Today, many of those live in a single spreadsheet-based tool. Reporters can drop datasets into Cheatsheet and then run different preset scripts and prompts. Each capability in the menu is known as a ‘recipe.’ Some of those recipes, like transcribing thousands of hours of video footage and summarizing transcriptions, are foundational to the Manosphere Report.

    “Still in its beta, Cheatsheet has already been tested on about 300 users in the newsroom, with 50 of those being ‘really active users’, according to Seward. Right now, at least one new project is created in Cheatsheet every day. The tool has been used to investigate an election-interference group, to transcribe and translate Syrian prison records, and to find recent instances of Trump talking about Jan. 6. At times, Cheatsheet has even been used to take on more thorough historical analysis of podcasts.”

    #

  • Nicholas Hune-Brown in The Local exhibits more diligence than most publications in tracking down the source of an AI-written pitch:

    “I was embarrassed. I had been naively operating with a pre-ChatGPT mindset, still assuming a pitch’s ideas and prose were actually connected to the person who sent it. Worse, the reason the pitch had been appealing to me to begin with was likely because a large language model somewhere was remixing my own prompt asking for stories where ‘health and money collide,’ flattering me by sending me back what I wanted to hear.”

    Such grubby scams are, he thinks, reflective of the diminished media environment we live in:

    “Every media era gets the fabulists it deserves. If Stephen Glass, Jayson Blair and the other late 20th century fakers were looking for the prestige and power that came with journalism in that moment, then this generation’s internet scammers are scavenging in the wreckage of a degraded media environment. They’re taking advantage of an ecosystem uniquely susceptible to fraud – where publications with prestigious names publish rickety journalism under their brands, where fact-checkers have been axed and editors are overworked, where technology has made falsifying pitches and entire articles trivially easy, and where decades of devaluing journalism as simply more ‘content’ have blurred the lines so much it can be difficult to remember where they were to begin with.”

    Immunisation against these sorts of scams and misinformation is possible, as Hune-Brown demonstrates with his dogged investigative work. But the skills needed are precisely the ones that the media have spent the last decade or two gutting. It does not bode well. #

  • I wrote earlier this year about potential use-cases for AI, including things like “the rubber duck” and “the tireless intern”. In this post, Drew Breunig outlines three much clearer and more tangible categories of use-case: Gods (superintelligences that can replace humans and act autonomously), Interns (copilots that work under close human supervision), and Cogs (small, self-contained, automated functions that operate as part of a larger process). It’s a great classification. #

  • Paul Ford is wonderful on the shameless usefulness of AI:

    “So I should reject this whole crop of image-generating, chatting, large-language-model-based code-writing infinite typing monkeys. But, dammit, I can’t. I love them too much. I am drawn back over and over, for hours, to learn and interact with them. I have them make me lists, draw me pictures, summarize things, read for me. Where I work, we’ve built them into our code. I’m in the bag. Not my first hypocrisy rodeo.”

    #

  • I wrote a few weeks ago about use cases for AI. In a similar vein is this thoughtful piece from the New York Times’ Zach Seward on the role of AI in responsible, thoughtful journalism.

    I love the sentiment that AIs are actually often more useful when they’re not being creative, but instead are interpreting creativity and translating it into something more rigid:

    “People look at tools like ChatGPT and think their greatest trick is writing for you. But, in fact, the most powerful use case for LLMs is the opposite: creating structure out of unstructured prose. [This gives] us a sense of the technology’s greatest promise for journalism (and, I’d argue, lots of other fields). Faced with the chaotic, messy reality of everyday life, LLMs are useful tools for summarizing text, fetching information, understanding data, and creating structure.”
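That pattern – structure out of prose – is easy to demonstrate. A common version is to ask the model for JSON against a fixed schema, then validate whatever comes back. The prompt wording and the model reply below are invented for illustration, not taken from any Times tool:

```python
import json

# Hypothetical illustration of "structure out of unstructured prose":
# ask the model for JSON against a fixed schema, then validate the reply.

PROMPT_TEMPLATE = (
    "Extract every person mentioned in the text below as JSON: "
    'a list of objects with "name" and "role" keys.\n\nText: {text}'
)

def build_prompt(text):
    return PROMPT_TEMPLATE.format(text=text)

def parse_response(raw):
    """Parse the model's reply and check it matches the schema we asked for."""
    people = json.loads(raw)
    if not all({"name", "role"} <= set(p) for p in people):
        raise ValueError("reply missing required keys")
    return people

# An invented model reply, since no model is actually being called here:
reply = '[{"name": "Zach Seward", "role": "journalist"}]'
people = parse_response(reply)
```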

    #

  • A thoughtful and pragmatic post from the fine-art-trained – but technology-savvy – Sam Bleckley, on the limitations and the plausible future usage of generative AI for illustration.

    “This doesn’t mean illustrators will stop drawing and become prompt engineers. That will waste an immense amount of training and gain very little. Instead, I foresee illustrators concentrating even more on capturing the core features of an image, letting generative AI fill in details, and then correcting those details as necessary.”

    #

  • Cory Doctorow famously coined the term “enshittification” to describe the process by which online platforms – through a combination of apathy and cynicism – tend to start out useful and then eventually become cesspools of awfulness.

    Gary Marcus observes the way that the muckspreaders that are LLMs have gone from covering the internet in a light spray to a gushing torrent. Search engines, social platforms, digital goods; all are becoming less and less useful as they digest and regurgitate incorrect, AI-generated information.

    “Cesspools of automatically-generated fake websites, rather than ChatGPT search, may ultimately come to be the single biggest threat that Google ever faces. After all, if users are left sifting through sewers full of useless misinformation, the value of search would go to zero – potentially killing the company.

    “For the company that invented Transformers – the major technical advance underlying the large language model revolution – that would be a strange irony indeed.”

    Related: Maggie Harrison’s recent “When AI is trained on AI-generated data, strange things start to happen”. #

  • Drew Breunig compares AI to a platypus – usefully, as it happens:

    “When trying to get your head around a new technology, it helps to focus on how it challenges existing categorizations, conventions, and rule sets. Internally, I’ve always called this exercise, ‘dealing with the platypus in the room.’ Named after the category-defying animal; the duck-billed, venomous, semi-aquatic, egg-laying mammal.

    “There’s been plenty of platypus over the years in tech. Crypto. Data Science. Social Media. But AI is the biggest platypus I’ve ever seen… Nearly every notable quality of AI and LLMs challenges our conventions, categories, and rulesets.”

    #

  • A couple of months ago a video did the rounds of David Guetta, who’d used AI to conjure up a realistic-sounding sample of Eminem. It was interesting, but also pretty meh: it sounded like Eminem, sure, but the lyrics were nonsensical and it all had a slightly uncanny feel about it. It felt like the jobs of rappers were safe for now.

    Then, a couple of weeks ago, hip-hop duo AllttA released the song Savages, and everything changed. It illustrated how far things have come in just a couple of months, but also how incredible human-AI collaborations could be: it features lyrics by rapper Mr. J Medeiros, delivered in the unmistakeable flow of Jay-Z, backed by a genuinely good beat. It’s amazing and scary in equal measure.

    Over at BuzzFeed News (RIP), Chris Stokel-Walker takes a tour through some of the recent developments in AI-generated hip-hop, and delves into the legal issues that are looming:

    “While a consensus is forming that generative AI is potentially troublesome, no one really knows whether hobbyist creators are on shaky legal ground or not. pieawsome said he thinks of what he does as the equivalent of modding a game or producing fanfiction based on a popular book. ‘It’s our version of that,’ he said. ‘That may be a good thing. It may be a bad thing. I don’t know. But it’s kind of an inevitable thing that was going to happen.’”

    #

  • Izzy Miller trained a large language model – similar to GPT – on the entire history of his friends’ group chats, which had been running for years. He then hosted an interactive version of it for his friends, so they could all chat with the AI versions of themselves. It worked surprisingly well:

    “This has genuinely provided more hours of deep enjoyment for me and my friends than I could have imagined. Something about the training process optimized for outrageous behavior, and seeing your conversations from a third-person perspective casts into stark relief how ridiculous and hilarious they can be.”
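A project like this starts with an unglamorous data-prep step: turning the chat export into training examples, where the prompt is a window of recent messages and the completion is the next one. A hypothetical sketch of that step, with invented names and messages:

```python
# Hypothetical data-prep sketch for fine-tuning on a group chat:
# each training pair is (recent conversation -> next message).

def to_training_pairs(messages, context=3):
    """messages: list of (sender, text) tuples in chronological order."""
    pairs = []
    for i in range(context, len(messages)):
        prompt = "\n".join(f"{s}: {t}" for s, t in messages[i - context:i])
        sender, text = messages[i]
        pairs.append({"prompt": prompt, "completion": f"{sender}: {text}"})
    return pairs

# Invented example messages, standing in for a real chat export:
chat = [
    ("izzy", "anyone up for dinner?"),
    ("friend_a", "only if it's tacos"),
    ("izzy", "it's always tacos"),
    ("friend_b", "tacos it is"),
]
pairs = to_training_pairs(chat)
```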

    The post contains lots of technical details, if you have the urge to do something similar yourself. #

  • The great linguist Noam Chomsky outlines his frustrations with the buzz around generative AI: principally, that it might obscure the wonder of humanity and our incredible real intelligence.

    “The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response or most probable answer to a scientific question. On the contrary, the human mind is a surprisingly efficient and even elegant system that operates with small amounts of information; it seeks not to infer brute correlations among data points but to create explanations.

    “Indeed, such programs are stuck in a prehuman or nonhuman phase of cognitive evolution. Their deepest flaw is the absence of the most critical capacity of any intelligence: to say not only what is the case, what was the case and what will be the case – that’s description and prediction – but also what is not the case and what could and could not be the case. Those are the ingredients of explanation, the mark of true intelligence.”

    #

  • An interesting paper from law professor Michael D. Murray that investigates the question of who owns artworks that are generated by computers and AIs:

    “Artists and creatives who mint generative NFTs, collectors and investors who purchase and use them, and art law attorneys all should have a clear understanding of the copyright implications involved with different forms of generative art. This guide seeks to educate each of these audiences and bring clarification to the issues of copyrights in the world of generative art NFTs.”

    He concludes that in many cases they are uncopyrightable – including lots that are the basis for lucrative NFT projects. #

  • Part of the skill of augmented creativity will be writing good prompts for the AI to follow. With that in mind, Guy Parsons of DALL•Ery GALL•Ery showcases some interesting examples, and offers advice about what seems to work and how.

    One of the most interesting sections is on the ethical quandaries thrown up by DALL•E’s ability to replicate the styles of individual artists:

    “Artists need to make a living. After all, it’s only through the creation of human art to date DALL•E has anything to be trained on! So what becomes of an artist, once civilians like you and I can just conjure up art ‘in the style of [artist]’?

    Van Gogh’s ghost can surely cope with such indignities – but living artists might feel differently about having their unique style automagically cloned.”

    There are no easy answers, morally or legally. #

  • Another great use of GPT-3, like those I’ve written about before. This time it’s for jargon-busting: paste in any text, from any field, and it will translate it into plain English.

    For example, this:

    “The three-dimensional structure of DNA – the double helix – arises from the chemical and structural features of its two polynucleotide chains.”

    …becomes:

    “The shape of DNA is called a double helix, and it forms due to the chemical makeup and structural characteristics of the two different strings of molecules that make up DNA.”
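Tools like this are mostly prompt design. The exact prompt this one uses isn’t published, so the sketch below is just a guess at the general shape a jargon-buster’s request to GPT-3 might take:

```python
# Hypothetical prompt for a jargon-busting tool; the real tool's
# wording isn't published, so this is only the general shape.

def jargon_buster_prompt(passage):
    return (
        "Rewrite the following passage in plain English, so that a reader "
        "with no background in the field can understand it. Keep the "
        "meaning intact and avoid technical terms.\n\n"
        f"Passage: {passage}\n\nPlain English:"
    )

prompt = jargon_buster_prompt(
    "The three-dimensional structure of DNA – the double helix – arises "
    "from the chemical and structural features of its two polynucleotide chains."
)
```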

    Pretty interesting. #