Description
Hello!
First, thanks so much for making this and the accompanying software-mentions-client library. I really appreciate all of the work that has gone into making this available!
I am looking for some help with detecting and extracting software-sharing URLs. For example, in the CZI Software Mentions Dataset paper, many pieces of software are mentioned and correctly found, but the service doesn't seem to find this link/passage:
This model achieves a 10-fold cross-validation F1 score of 0.92. More details can be found at: https://github.com/chanzuckerberg/software-mention-extraction.
or:
All the code used for extraction, disambiguation and linking, as well as instructions on how to reproduce the results and some starter code is available at a GitHub repository https://github.com/chanzuckerberg/software-mentions under the MIT license with the permanent snapshot at [43].
I was hoping to be able to extract those links, since that is where the authors are sharing their software, but I think I may be doing something wrong or have the deployment configured incorrectly.
Another example can be found in the Rise of Open Science paper. This time the service finds very little software (which I think is to be expected), but it also doesn't find the authors sharing their code in the following footnote:
The codes can be accessed at https://github.com/caohanch/paper_data_method_sharing/.
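To make concrete what I was hoping to get back, here is a minimal sketch of the expected output for the three passages above. It uses a plain regular expression from the Python standard library, not anything from software-mentions or software-mentions-client; the helper name `extract_urls` and the pattern are my own assumptions for illustration only.

```python
import re
from typing import List

# Hypothetical helper for illustration; NOT part of software-mentions.
# Matches http(s) URLs up to the next whitespace or closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s\]\)>]+")


def extract_urls(text: str) -> List[str]:
    """Return URLs found in text, with trailing sentence punctuation removed."""
    return [m.group(0).rstrip(".,;:") for m in URL_PATTERN.finditer(text)]


# The three passages quoted in this issue.
passages = [
    "This model achieves a 10-fold cross-validation F1 score of 0.92. "
    "More details can be found at: "
    "https://github.com/chanzuckerberg/software-mention-extraction.",
    "All the code used for extraction, disambiguation and linking, as well as "
    "instructions on how to reproduce the results and some starter code is "
    "available at a GitHub repository "
    "https://github.com/chanzuckerberg/software-mentions under the MIT license "
    "with the permanent snapshot at [43].",
    "The codes can be accessed at "
    "https://github.com/caohanch/paper_data_method_sharing/.",
]

for passage in passages:
    print(extract_urls(passage))
```

A regex like this only finds the URL strings; it says nothing about whether the link is the authors sharing their own software, which is the part I was hoping the service could help with.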
I am using the grobid/software-mentions:0.8.1 docker image and I haven't changed any of the configuration details because I already saw in the README:
It is recommended to use the Docker image for running the service. The best Deep Learning models are included and are used by default by this image.
Please let me know if you have any thoughts or ideas. Any help is greatly appreciated; I am just unsure whether I am doing something incorrectly.