[Z7 Beta] Random (?) sentences in the outline view
Hi all,
when working with this PDF ( https://www.degruyter.com/document/doi/10.1515/9781400856626.159/html), I've noticed that Zotero shows an outline as available, which was already a surprise because Firefox doesn't show an outline for this document.
Apparently there is some automatic outline detection happening (which would be a great feature), but in this case it is not very helpful; it just shows some apparently random sentences in the outline view:
https://s3.amazonaws.com/zotero.org/images/forums/u5025031/oas4su82aldgac7hmwhi.png
https://s3.amazonaws.com/zotero.org/images/forums/u5025031/q0vtjjvsbbd5knuuwxar.png
Is it helpful to you if I report all the problems I'm seeing with this feature?
In this book ( http://link.springer.com/10.1007/978-3-322-80378-8), the detection extracts only the book's title:
https://s3.amazonaws.com/zotero.org/images/forums/u5025031/qvbuogn2t2sinkd5nyrz.png
In this text ( https://www.nomos-elibrary.de/index.php?doi=10.5771/0023-5652-2015-182-78) it just extracts the first sentence and the title:
https://s3.amazonaws.com/zotero.org/images/forums/u5025031/btp7pt5czzr9h7hayw3g.png
In another scanned and OCRed book, it recognizes only part of one chapter heading:
https://s3.amazonaws.com/zotero.org/images/forums/u5025031/3kloi0gcn21ix2diae49.png
The detected outline:
https://s3.amazonaws.com/zotero.org/images/forums/u2119014/a6zaj4vkkfs1pk7lge32.png
https://s3.amazonaws.com/zotero.org/images/forums/u265723/a2jheecqqm6kijs2iis7.png
In this case, I had removed the first page of the PDF file before generating the outline to obtain this result.
If I keep the first page, here is the result:
https://s3.amazonaws.com/zotero.org/images/forums/u265723/rpl57rjak7jf6zudcwq5.png
Zotero 7.0.0-beta.85+c0c00a00e (64-bit)
Windows 10
https://s3.amazonaws.com/zotero.org/images/forums/u265723/vjnz99zpq74z8fh9v9em.png
Note that it works nicely in some cases, so the feature is really useful to have, even if it only partially works.
Zotero 7.0.0-beta.85+c0c00a00e (64-bit)
Windows 10
I have sent them to support@zotero.org.
https://s3.amazonaws.com/zotero.org/images/forums/u265723/5097ld8ln3sl29v03rkg.png
https://s3.amazonaws.com/zotero.org/images/forums/u265723/lubofuyzmziidpul78rj.png
https://s3.amazonaws.com/zotero.org/images/forums/u265723/i1ulc6dqmnf8aqdvq3ve.png
It extracts some useful bookmarks, but the hierarchical structure is not recognized:
https://s3.amazonaws.com/zotero.org/images/forums/u265723/k09gyqyo1sc0itv6hhtp.png
Zotero 7.0.0-beta.87+f59a4da7f (64-bit)
Windows 10
https://s3.amazonaws.com/zotero.org/images/forums/u5025031/4kv0unsu2103h72jibqz.png
Maybe it is because I quite often work with OCRed texts, but so far the outline detection feature has rarely been successful for me.