ToTTo: Open-domain English table-to-text dataset



google-research-datasets/ToTTo

“ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description.

During the dataset creation process, tables from English Wikipedia are matched with (noisy) descriptions. Each table cell mentioned in the description is highlighted and the descriptions are iteratively cleaned and corrected to faithfully reflect the content of the highlighted cells.

We hope this dataset can serve as a useful research benchmark for high-precision conditional text generation…”

Source: github.com/google-research-datasets/ToTTo
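The task described in the quote can be pictured with a small sketch. It is not the dataset's official loader or schema: the function name, the tag strings, and the toy table below are all made up for illustration. It only shows the shape of the input, a table plus a set of highlighted cell coordinates, flattened into a string that a text generator could condition on.

```python
from typing import List, Tuple

# Hypothetical in-memory layout (an assumption, not the dataset's real schema):
# a table is a list of rows, and each highlighted cell is a (row, column) index pair.
Table = List[List[str]]


def linearize_highlighted(
    page_title: str,
    headers: List[str],
    rows: Table,
    highlighted: List[Tuple[int, int]],
) -> str:
    """Flatten only the highlighted cells into one conditioning string.

    Each cell is paired with its column header so that the value keeps
    some context once it is lifted out of the table.
    """
    parts = [f"<page_title> {page_title} </page_title>"]
    for row_idx, col_idx in highlighted:
        value = rows[row_idx][col_idx]
        header = headers[col_idx]
        parts.append(f"<cell> {value} <col_header> {header} </col_header> </cell>")
    return " ".join(parts)


# Toy data invented for this sketch; it is not taken from ToTTo.
headers = ["Year", "Competition", "Position"]
rows = [
    ["2019", "City Marathon", "3rd"],
    ["2021", "National Championships", "1st"],
]
highlighted = [(1, 0), (1, 1), (1, 2)]  # the whole second row

linearized = linearize_highlighted("Example Runner", headers, rows, highlighted)
print(linearized)
```

A model for this task would take a string like `linearized` as input and be expected to produce a single sentence that mentions only what the highlighted cells support, which is the "high-precision" aspect the quote refers to.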

May 3, 2021