Information retrieval methods, machine learning models, and humans can suffer from a failure in judging information representativeness.
We refer to this problem as information bias.
In this work, we propose a method to evaluate information bias through conjunctive fallacies.
An experimental evaluation of different state-of-the-art entity retrieval models and human-curated benchmarks shows that both perform poorly at judging query-entity representativeness, while statistically based methods perform considerably better than humans.
@inproceedings{icsc/informationBias/2023,
  bibtex_show={true},
  title={Assessing Bias on Entity Retrieval Models through Conjunctive Fallacies},
  author={Edgard Marx},
  year={2023},
  booktitle={17th IEEE International Conference on Semantic Computing}
}
ICSC
NatUKE: A Benchmark for Natural Product Knowledge Extraction from Academic Literature
Paulo Viviurka do Carmo,
Edgard Marx,
Ricardo Marcacini,
Marilia Valli,
João Victor Silva e Silva,
Alan Pilon
International Conference on Semantic Computing,
2023
This work introduces a benchmark for natural product knowledge extraction from academic literature and evaluates different state-of-the-art unsupervised embedding generation methods for this task.
We show that chemical compound characteristics can be extracted automatically from academic literature with an unsupervised pipeline based on graph embedding methods.
We evaluated four methods (DeepWalk, Node2Vec, Metapath2Vec, and EPHEN) in a similarity-based graph completion evaluation scenario.
EPHEN achieves reasonable hits@k performance at bioactivity and isolation type extraction with 0.64 when k = 5 and 0.75 when k = 1, respectively.
Meanwhile, Metapath2Vec was the best performer, although with underwhelming results, when extracting compound name and species, with 0.20 and 0.44 when k = 50, respectively.
These results show that using text data and previously extracted knowledge from the knowledge graph provides the most stable performance.
They also show that some characteristics in these papers are more challenging to extract than others, and that using the knowledge graph topology as context data helps in these scenarios.
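For readers unfamiliar with the hits@k metric reported above, the following is a minimal sketch of one common definition, assuming each query counts as a hit when at least one correct item appears among its top-k ranked candidates. The function names `hits_at_k` and `mean_hits_at_k` and the toy data are hypothetical illustrations; the exact protocol used in the benchmark may differ.

```python
from typing import Hashable, Sequence, Set


def hits_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Return 1.0 if any relevant item is among the top-k ranked candidates, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def mean_hits_at_k(
    rankings: Sequence[Sequence[Hashable]],
    relevants: Sequence[Set[Hashable]],
    k: int,
) -> float:
    """Average hits@k over all queries (assumed definition, see lead-in)."""
    scores = [hits_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example: two queries, each with a ranked candidate list and one correct answer.
rankings = [["a", "b", "c"], ["x", "y", "z"]]
relevants = [{"b"}, {"z"}]
for k in (1, 2, 3):
    print(k, mean_hits_at_k(rankings, relevants, k))
```

Under this definition the score can only stay the same or increase as k grows, which is why results at different k values, such as k = 1 and k = 50, are not directly comparable.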
@inproceedings{icsc/natuke/2023,
  bibtex_show={true},
  title={NatUKE: A Benchmark for Natural Product Knowledge Extraction from Academic Literature},
  author={Paulo Viviurka do Carmo and Edgard Marx and Ricardo Marcacini and Marilia Valli and João Victor Silva e Silva and Alan Pilon},
  booktitle={17th IEEE International Conference on Semantic Computing},
  year={2023},
  publisher={IEEE}
}