Self-service plagiarism-checking tool for a non-profit research institute
The research institute needed to check articles for plagiarism at scale before publishing them. The solution had to be a self-service, browser-based internal tool that was easy to use.
Neal Analytics built the Azure Data Lake-hosted solution and secured access with Active Directory Federation Services (ADFS). We used Azure Cognitive Services for Optical Character Recognition (OCR) on non-text files such as PDF, JPG, and TIFF, and for the plagiarism API. We also developed an easy-to-use, web-based UX for uploading documents and sharing or exporting results.
With the Azure Data Lake-hosted solution, any authenticated user could upload documents from any browser and set a confidence-level goal (default 60%). Azure Cognitive Services and the web-based UX provided:
- Percentage of text likely coming from existing sources (plagiarism)
- URLs of the top 10 sources, each with its associated percentage
- Color-coded sentences linked to their source URLs
- PDF export of the full report, including color-coding and URLs
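
The report fields listed above could be derived from sentence-level match results roughly as follows. This is a minimal Python sketch under stated assumptions, not the actual implementation: the `SentenceMatch` record, the `summarize` function, the color palette, and the character-weighted percentages are all hypothetical, and the sketch assumes the 60% default acts as a per-sentence confidence cutoff, which the description above does not spell out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Hypothetical palette used to color-code sentences by source.
PALETTE = ["#e6194b", "#3cb44b", "#4363d8", "#f58231", "#911eb4",
           "#46f0f0", "#f032e6", "#bcf60c", "#fabebe", "#008080"]


@dataclass
class SentenceMatch:
    """One sentence of the uploaded document and its best match, if any."""
    text: str
    source_url: Optional[str]  # None when no source was found
    confidence: float          # assumed to be in the range 0.0 to 1.0


def summarize(matches, threshold=0.60, top_n=10):
    """Aggregate sentence-level matches into report-style fields.

    Assumption: a sentence counts as matched only when its confidence
    is at or above `threshold`; percentages are weighted by characters.
    """
    total_chars = sum(len(m.text) for m in matches)
    chars_per_source = Counter()
    for m in matches:
        if m.source_url is not None and m.confidence >= threshold:
            chars_per_source[m.source_url] += len(m.text)

    if total_chars == 0:
        return {"plagiarized_pct": 0.0, "top_sources": [], "colors": {}}

    # Top sources, each with the share of the document attributed to it.
    top_sources = [
        (url, 100.0 * chars / total_chars)
        for url, chars in chars_per_source.most_common(top_n)
    ]
    # One color per listed source, reused to highlight its sentences.
    colors = {
        url: PALETTE[i % len(PALETTE)]
        for i, (url, _) in enumerate(top_sources)
    }
    return {
        "plagiarized_pct": 100.0 * sum(chars_per_source.values()) / total_chars,
        "top_sources": top_sources,
        "colors": colors,
    }
```

Calling `summarize(matches)` with the default threshold ignores any sentence whose match confidence is below 0.60; a stricter or looser reviewer could pass a different `threshold`. The returned `colors` mapping is one way a front end could highlight each sentence in the color of its source URL.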