I believe ChatGPT using GPT-4 is only available to Plus customers (July 2023 version), so what follows was developed using GPT-3. In my next experiment, though, I will be able to use GPT-4 to process unstructured text in bulk, building on the results of today’s experiment.

In a blog post, Kingsley Uyi Idehen demonstrated his chat history showing how he developed an ontology from a mermaid diagram. This struck me as extremely valuable for us. A mermaid diagram is a textual description of a chart. Obsidian can understand mermaid diagrams and visualize them. Mermaid also has the ability to represent entity-relationship diagrams. I started sketching the entities and relationships in the antiquities trade in my Obsidian-powered lab notebook.

Here was my initial sketch:

An initial sketch of a Mermaid erDiagram of the antiquities trade.

The underlying markdown looks like this (here’s a snippet):

erDiagram
AUCTIONHOUSE ||--o{ ARTIFACT : auctions
AUCTIONHOUSE ||--o{ PERSON : sells_to
AUCTIONHOUSE ||--o{ MUSEUM : sells_to
MUSEUM {
string name
string city
string country
}

snip!
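To make the target of the conversion concrete, here is a minimal sketch of how the relationship lines of an erDiagram map onto RDFS classes and properties. It is not what ChatGPT does internally, and the function name `er_to_turtle` is hypothetical; it assumes relationship lines use plain ASCII hyphens (`||--o{`) and it ignores the attribute lines inside entity blocks.

```python
import re

# Relationship line, e.g.: AUCTIONHOUSE ||--o{ ARTIFACT : auctions
# The cardinality marker in the middle is captured but not interpreted.
REL = re.compile(r"^\s*(\w+)\s+(\S+)\s+(\w+)\s*:\s*(\w+)\s*$")
# Start of an entity block, e.g.: MUSEUM {
ENTITY = re.compile(r"^\s*(\w+)\s*\{\s*$")

PREFIXES = [
    "@prefix ex: <#> .",
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
]


def er_to_turtle(mermaid: str) -> str:
    """Turn erDiagram entities and relationships into RDFS class/property stubs."""
    classes: list[str] = []
    props: list[tuple[str, str, str]] = []

    def add_class(name: str) -> None:
        if name not in classes:
            classes.append(name)

    for line in mermaid.splitlines():
        rel = REL.match(line)
        ent = ENTITY.match(line)
        if rel:
            subj, _cardinality, obj, label = rel.groups()
            add_class(subj)
            add_class(obj)
            props.append((label, subj, obj))
        elif ent:
            add_class(ent.group(1))

    out = PREFIXES + ["", "# Classes"]
    out += [f"ex:{name} a rdfs:Class ." for name in classes]
    out += ["", "# Object Properties"]
    for label, subj, obj in props:
        out.append(f"ex:{label} a rdf:Property ;")
        out.append(f"    rdfs:domain ex:{subj} ;")
        out.append(f"    rdfs:range ex:{obj} .")
    return "\n".join(out)


snippet = """erDiagram
AUCTIONHOUSE ||--o{ ARTIFACT : auctions
AUCTIONHOUSE ||--o{ PERSON : sells_to
MUSEUM {
string name
}
"""
print(er_to_turtle(snippet))
```

One caveat: a label such as `sells_to` that appears on several relationships gets declared several times here, and under RDFS semantics multiple `rdfs:range` statements on one property are read together rather than as alternatives, so a real conversion would need to decide how to handle that.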

Over in the ChatGPT interface, I told it:

You are an excellent assistant with deep knowledge of research in the field of illegal and illicit antiquities. Generate an ontology using RDF-Turtle for the text that follows, using the following guidelines: 1 – Denote subjects and objects using relative hash-based hyperlinks i.e., negating the use of example.com. 2- Output response to a code-block. 3 – Place ## Turtle Start ## and ## Turtle End ## around the code within the code-block.

…and then pasted in my mermaid diagram beginning with the `erDiagram` line. ChatGPT duly transformed the mermaid code into RDF-Turtle:

## Turtle Start ##

@prefix ex: <#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Classes
ex:AUCTIONHOUSE a rdfs:Class .
ex:ARTIFACT a rdfs:Class .
ex:PERSON a rdfs:Class .
ex:MUSEUM a rdfs:Class .
ex:ORGANIZATION a rdfs:Class .
ex:GOVERNMENT_AGENCY a rdfs:Class .
ex:GALLERY a rdfs:Class .
ex:ART_WORK a rdfs:Class .

# Object Properties
ex:auctions a rdf:Property ;
rdfs:domain ex:AUCTIONHOUSE ;
rdfs:range ex:ARTIFACT .

…and so on…

However, I know that ChatGPT will sometimes introduce errors. My next instruction was to identify and correct any syntax errors in the RDF-Turtle, which it duly did. The resulting scheme works! I then asked it to consider what elements might be missing. It provided a list of nine possibilities under various headings. Three of them seemed reasonable to add, so I directed it to ‘Extend the ontology to take into account “Authorities and Enforcement”, “Authentication and Expertise”, and “Cultural Heritage and Repatriation”.’ Once it generated the extended ontology, I again asked it to correct any syntax errors. The resulting ontology is available here.
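Asking the model to check its own syntax is convenient, but a quick independent check is cheap. The following is a crude lint, not a Turtle parser, and the function name `lint_turtle` is hypothetical. It only looks for three slips that are easy to pick up when copying generated Turtle between a chat window and a blog or notebook: typographic double quotes, prefixes that are used but never declared, and a final statement without a closing full stop. A proper parser would be the better tool for anything beyond a sanity check.

```python
import re


def lint_turtle(ttl: str) -> list[str]:
    """Return warnings for a few common slips; an empty list means nothing was flagged."""
    warnings = []
    declared = set(re.findall(r"@prefix\s+(\w*):", ttl))
    last_statement = ""
    for n, line in enumerate(ttl.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment (including the ## Turtle Start ## markers)
        last_statement = stripped
        if any(q in stripped for q in "“”"):
            warnings.append(
                f"line {n}: typographic quotes; Turtle strings need straight quotes"
            )
        # Drop IRIs and string literals so their contents are not mistaken for prefixes.
        body = re.sub(r"<[^>]*>", "", stripped)
        body = re.sub(r'[“"][^”"]*[”"]', "", body)
        for prefix in re.findall(r"(?<![\w@])(\w+):\w", body):
            if prefix not in declared:
                warnings.append(f"line {n}: prefix '{prefix}' is used but not declared")
    if last_statement and not last_statement.endswith("."):
        warnings.append("last statement does not end with '.'")
    return warnings


sample = """@prefix ex: <#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
# Classes
ex:MUSEUM a rdfs:Class .
"""
for warning in lint_turtle(sample):
    print(warning)
```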

Now the question is, can I use that ontology with ChatGPT (or GPT, full stop) to render articles and information about the antiquities trade into structured data? Yes, yes I can.

I gave it this prompt, which integrates the first four paragraphs from the article about Giacomo Medici on the Trafficking Culture website:

Using the ontology exclusively, please create specific instances and data about individuals within the antiquities trade from the following encyclopedia text. Also create the RDF graph. TEXT: Giacomo Medici is an Italian antiquities dealer who was convicted in 2005 of receiving stolen goods, illegal export of goods, and conspiracy to traffic.

Medici started dealing in antiquities in Rome during the 1960s. In July 1967, he was convicted in Italy of receiving looted artefacts, though in the same year he met and became an important supplier of antiquities to US dealer Robert Hecht. In 1968, Medici opened the gallery Antiquaria Romana in Rome and began to explore business opportunities in Switzerland. It is widely believed that in December 1971 he bought the illegally-excavated Euphronios (Sarpedon) krater from tombaroli before transporting it to Switzerland and selling it to Hecht.

In 1978, he closed his Rome gallery, and entered into partnership with Geneva resident Christian Boursaud, who started consigning material supplied by Medici for sale at Sotheby’s London. Together, they opened Hydra Gallery in Geneva in 1983. It has been estimated that throughout the 1980s Medici was the source of more consignments to Sotheby’s London than any other vendor. At any one time, Boursaud might consign anything up to seventy objects, worth together as much as £500,000. Material would be delivered to Sotheby’s from Geneva by courier.

In October 1985, the Hydra Gallery sold fragments of the Onesimos kylix to the J. Paul Getty Museum for $100,000, providing a false provenance by way of the fictitious Zbinden collection, a provenance that was sometimes used for material offered at Sotheby’s. The Getty returned the kylix to Italy in 1999.

The result:

## Turtle Start ##

@prefix ex: <#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Individuals
ex:Giacomo_Medici rdf:type ex:PERSON ;
ex:name "Giacomo Medici" ;
ex:role "antiquities dealer" ;
ex:nationality "Italian" ;
ex:convicted_of "receiving stolen goods, illegal export of goods, and conspiracy to traffic" ;
ex:started_dealing_in "1960s" .

ex:Robert_Hecht rdf:type ex:PERSON ;
ex:name "Robert Hecht" ;
ex:role "US dealer" .

ex:Christian_Boursaud rdf:type ex:PERSON ;
ex:name "Christian Boursaud" ;
ex:role "Geneva resident" .

ex:Hydra_Gallery rdf:type ex:GALLERY ;
ex:name "Hydra Gallery" ;
ex:city "Geneva" ;
ex:country "Switzerland" .

ex:Sothebys_London rdf:type ex:AUCTIONHOUSE ;
ex:name "Sotheby’s London" ;
ex:city "London" ;
ex:country "United Kingdom" .

ex:Euphronios_krater rdf:type ex:ARTIFACT ;
ex:from_culture "Euphronios (Sarpedon) krater" ;
ex:dates_from "December 1971" .

ex:Onesimos_kylix rdf:type ex:ARTIFACT ;
ex:from_culture "Onesimos kylix" ;
ex:dates_from "October 1985" .

ex:J_Paul_Getty_Museum rdf:type ex:MUSEUM ;
ex:name "J. Paul Getty Museum" ;
ex:city "Los Angeles" ;
ex:country "United States" .

# Relationships
ex:Giacomo_Medici ex:sells_to ex:Robert_Hecht ;
ex:sells_to ex:Sothebys_London ;
ex:buys ex:Euphronios_krater ;
ex:buys ex:Onesimos_kylix .

ex:Christian_Boursaud ex:works_with ex:Giacomo_Medici ;
ex:sells_to ex:Sothebys_London .

ex:Hydra_Gallery ex:has_possesion_of ex:Euphronios_krater ;
ex:has_possesion_of ex:Onesimos_kylix .

ex:Onesimos_kylix ex:authenticated_by ex:Getty_Expertise_Institution .

ex:Euphronios_kylix ex:repatriated_to ex:Italy .

## Turtle End ##

When visualized, it looks pretty darn good, though ‘Getty_Expertise_Institution’ is a bit funny. Presumably, the larger context window and abilities of GPT-4 would catch that sort of thing.

Visualization of the graph from the Medici article
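Oddities like ‘Getty_Expertise_Institution’ can also be caught mechanically: it is a term that turns up in the relationships without ever having been declared as an individual. Below is a minimal sketch of such a check. It assumes the output uses straight double quotes and only the small Turtle subset seen above (prefixed names, plain string literals, `;` and `,`); the function names `parse_triples` and `undeclared_terms` are hypothetical.

```python
import re

# A token is a quoted string, a punctuation mark, or a run of other characters.
TOKEN = re.compile(r'"[^"]*"|[;,.]|[^\s;,."]+')
TYPE_PREDICATES = ("a", "rdf:type")


def parse_triples(ttl: str) -> list[tuple[str, str, str]]:
    """Read (subject, predicate, object) triples from a very small Turtle subset."""
    lines = [
        ln for ln in ttl.splitlines()
        if not ln.lstrip().startswith(("#", "@prefix"))
    ]
    triples, subj, pred = [], None, None
    for tok in TOKEN.findall("\n".join(lines)):
        if tok == ".":      # end of statement: next token is a new subject
            subj = pred = None
        elif tok == ";":    # same subject, new predicate follows
            pred = None
        elif tok == ",":    # same subject and predicate, new object follows
            continue
        elif subj is None:
            subj = tok
        elif pred is None:
            pred = tok
        else:
            triples.append((subj, pred, tok))
    return triples


def undeclared_terms(ttl: str) -> list[str]:
    """List ex: terms used in relationships that never received a type."""
    triples = parse_triples(ttl)
    typed = {s for s, p, _ in triples if p in TYPE_PREDICATES}
    missing = []
    for s, p, o in triples:
        if p in TYPE_PREDICATES:
            continue
        for term in (s, o):
            if term.startswith("ex:") and term not in typed and term not in missing:
                missing.append(term)
    return missing


sample = """
@prefix ex: <#> .
ex:Onesimos_kylix rdf:type ex:ARTIFACT ;
ex:from_culture "Onesimos kylix" .
ex:Hydra_Gallery rdf:type ex:GALLERY ;
ex:name "Hydra Gallery" .
ex:Hydra_Gallery ex:has_possesion_of ex:Onesimos_kylix .
ex:Onesimos_kylix ex:authenticated_by ex:Getty_Expertise_Institution .
ex:Euphronios_kylix ex:repatriated_to ex:Italy .
"""
print(undeclared_terms(sample))
```

Run against the full output above, a check like this should flag not only `ex:Getty_Expertise_Institution` but also `ex:Euphronios_kylix` and `ex:Italy`, neither of which is declared among the individuals.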

All in all, I’m pretty happy with this. We’re a couple of steps closer now to being able to represent our data as Linked Open Data, and I think that when we run these subject-predicate-object relationships through our knowledge graph embedding model pipeline, we’ll be able to generate better leads, better tips, better predictions.
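The embedding pipeline itself is not described here, so the following is only a sketch of one common preparatory step: knowledge graph embedding models typically work on integer ids rather than names, so the triples are indexed first. The function name `index_triples` and the choice of plain dictionaries are assumptions for illustration.

```python
def index_triples(triples):
    """Map entity and relation names to integer ids, in order of first appearance."""
    entities, relations = {}, {}
    indexed = []
    for subj, pred, obj in triples:
        s_id = entities.setdefault(subj, len(entities))
        r_id = relations.setdefault(pred, len(relations))
        o_id = entities.setdefault(obj, len(entities))
        indexed.append((s_id, r_id, o_id))
    return indexed, entities, relations


# A few relationships taken from the Medici output.
triples = [
    ("ex:Giacomo_Medici", "ex:sells_to", "ex:Robert_Hecht"),
    ("ex:Giacomo_Medici", "ex:sells_to", "ex:Sothebys_London"),
    ("ex:Christian_Boursaud", "ex:works_with", "ex:Giacomo_Medici"),
]
indexed, entities, relations = index_triples(triples)
print(indexed)
```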