Writing Translator Code
Below we will describe how the detect* and do* functions of Zotero translators can and should be coded. If you are unfamiliar with JavaScript, make sure to check out a JavaScript tutorial to get familiar with the syntax. In addition to the information on this page, it can often be very informative to look at existing translators to see how things are done. A particularly helpful guide with up-to-date recommendations on best coding practices is provided by the Wikimedia Foundation, whose tool Citoid uses Zotero translators.
While translators can be written with any text editor, the built-in Translator Editor can make writing them much easier, as it provides the option to test and troubleshoot translators relatively quickly.
New web translators should use the Translator Editor's web translator template as a starting point. The template can be inserted in the Code tab: click the green plus dropdown and choose "Add web translator template".
Web Translators
detectWeb
detectWeb is run to determine whether item metadata can indeed be retrieved from the webpage. The return value of this function should be the detected item type (e.g. "journalArticle", see the overview of Zotero item types), or, if multiple items are found, "multiple". If no item(s) can be detected on the current page, return false.
detectWeb receives two arguments: the webpage document object and URL (typically named doc and url). In some cases, the URL provides all the information needed to determine whether item metadata is available, allowing for a simple detectWeb function, e.g. (example from Cell Press.js):
function detectWeb(doc, url) {
    if (url.includes("search/results")) {
        return "multiple";
    }
    else if (url.includes("content/article")) {
        return "journalArticle";
    }
    return false;
}
doWeb
doWeb is run when a user, wishing to save one or more items, activates the selected translator. It can be seen as the entry point of the translation process.
The signature of doWeb should be
doWeb(doc, url)
Here doc refers to the DOM object of the web page that the user wants to save as a Zotero item, and url is the page's URL as a string.
In this section, we will describe the common tasks in the translation workflow started by doWeb().
Saving Single Items
Scraping for metadata
"Scraping" refers to the act of collecting information that can be used to populate Zotero item fields from the web page. Such information typically includes the title, creators, permanent URL, and source of the work being saved (for example, the title/volume/pages of a journal).
Having identified what information to look for, you need to know where to look. The best way to do this is to use the web inspection tools that come with the browser (Firefox, Chromium-based, and Webkit/Safari). They are indispensable for locating the DOM node or HTML element by visual inspection, searching, or browsing the DOM tree.
To actually retrieve information from the nodes in your translator code, you should be familiar with the use of selectors, in the way they are used with the JavaScript API function querySelectorAll().
Most often, you will do the scraping using the helper functions text() and attr(), for retrieving text content and attribute value, respectively. In fact, these two actions are performed so often that text() and attr() are available to the translator script as top-level functions.
function text(parentNode, selector[, index])
function attr(parentNode, selector, attributeName[, index])
- text() finds the descendant of parentNode (which can also be a document) that matches selector, and returns the text content (i.e. the value of the textContent property) of the selected node, with leading and trailing whitespace trimmed. If the selector doesn't match, the empty string is returned.
- attr() similarly uses the selector to locate a descendant node. However, it returns the value of the HTML attribute attributeName on that element. If the selector doesn't match, or if the specified attribute is not present on that element, the empty string is returned.
Optionally, a number index (zero-based) can be used to select a specific node when the selector matches multiple nodes. If the index is out of range, the return value of both functions will be the empty string.
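The behavior described above can be approximated in a few lines. The following is a minimal sketch for illustration only, not Zotero's actual implementation: sketchText and sketchAttr are hypothetical names, and fakeDoc is a hand-built stand-in for a real DOM document so that the example runs outside a browser.

```javascript
// Hypothetical re-implementations of the helpers, for illustration only.
// They follow the description above: trimmed text content or attribute
// value, and the empty string whenever nothing matches.
function sketchText(parentNode, selector, index) {
    const nodes = parentNode.querySelectorAll(selector);
    const node = nodes[index || 0];
    return node ? node.textContent.trim() : "";
}

function sketchAttr(parentNode, selector, attributeName, index) {
    const nodes = parentNode.querySelectorAll(selector);
    const node = nodes[index || 0];
    if (!node || !node.hasAttribute(attributeName)) return "";
    return node.getAttribute(attributeName);
}

// Stand-in for a DOM node (assumption: only the three members used above
// are needed). A real translator works with real DOM nodes instead.
function fakeNode(textContent, attributes) {
    return {
        textContent,
        hasAttribute: name => Object.prototype.hasOwnProperty.call(attributes, name),
        getAttribute: name => attributes[name]
    };
}

// Stand-in for a document: maps a few selectors to prepared node lists.
const fakeDoc = {
    querySelectorAll(selector) {
        const nodesBySelector = {
            "h1.title": [fakeNode("  An Example Title \n", {})],
            "a.author": [
                fakeNode("Ada Lovelace", { href: "/authors/1" }),
                fakeNode("Charles Babbage", { href: "/authors/2" })
            ]
        };
        return nodesBySelector[selector] || [];
    }
};

const title = sketchText(fakeDoc, "h1.title");                       // first match, trimmed
const secondAuthor = sketchText(fakeDoc, "a.author", 1);             // zero-based index
const secondAuthorLink = sketchAttr(fakeDoc, "a.author", "href", 1);
const outOfRange = sketchText(fakeDoc, "a.author", 5);               // index past the end
const noMatch = sketchAttr(fakeDoc, "meta[name=doi]", "content");    // selector matches nothing
```

In a translator, the same calls would be made with the built-in text() and attr() and the doc argument passed to doWeb.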
Another less-used helper function, innerText(), has the same signature as text(), but it differs from the latter by returning the selected node's innerText value, which is affected by how the node's content would have been rendered.
In addition, you can always use the API functions querySelector and querySelectorAll directly, but the helper functions should be preferred when they are adequate for the job.
In some older translator code, you are likely to encounter node selection expressed by XPath. Although XPath has its uses, for the most common types of scraping the selector-based functions should be preferred because of the simpler syntax of selectors.
Metadata
The first step towards saving an item is to create an item object of the desired item type (examples from "NCBI PubMed.js"):
var newItem = new Zotero.Item("journalArticle");
Metadata can then be stored in the properties of the object. Of the different fields available for the chosen item type (see the Field Index), only the title is required. E.g.:
var title = article.ArticleTitle.text().toString();
newItem.title = title;
var PMID = citation.PMID.text().toString();
newItem.url = "http://www.ncbi.nlm.nih.gov/pubmed/" + PMID;
After all metadata has been stored in the item object, the item can be saved:
newItem.complete();
This process can be repeated (e.g. using a loop) to save multiple items.
Attachments
Attachments may be saved alongside item metadata via the item object's attachments property. Common attachment types are full-text PDFs, links and snapshots. An example from "Pubmed Central.js":
var linkURL = "http://www.ncbi.nlm.nih.gov/pmc/articles/PMC" + ids[i] + "/";
newItem.attachments = [{
    url: linkURL,
    title: "PubMed Central Link",
    mimeType: "text/html",
    snapshot: false
}];
var pdfURL = "http://www.ncbi.nlm.nih.gov/pmc/articles/PMC" + ids[i] + "/pdf/" + pdfFileName;
newItem.attachments.push({
    title: "Full Text PDF",
    mimeType: "application/pdf",
    url: pdfURL
});
An attachment can only be saved if the source is indicated. The source is often a URL (set on the url property), but can also be a file path (set on path) or a document object (set on document). Other properties that can be set are mimeType ("text/html" for webpages, "application/pdf" for PDFs), title, and snapshot (if the latter is set to false, an attached webpage is always saved as a link).
In the very common case of saving the current page as an attachment, set document to the current document, so that Zotero doesn't have to make an additional request:
newItem.attachments.push({
    title: "Snapshot",
    document: doc
});
When document is set, the MIME type will be set automatically.
Zotero will automatically use proxied versions of attachment URLs returned from translators when the original page was proxied, which allows translators to construct and return attachment URLs without needing to know whether proxying is in use. However, some sites expect unproxied PDF URLs at all times, causing PDF downloads to potentially fail if requested via a proxy. If a PDF URL is extracted directly from the page, it's already a functioning link that's proxied or not as appropriate, and a translator should include proxy: false in the attachment metadata to indicate that further proxying should not be performed:
item.attachments.push({
    url: realPDF,
    title: "EBSCO Full Text",
    mimeType: "application/pdf",
    proxy: false
});
Notes
Notes are saved similarly to attachments. The content of the note, which should be a string, should be stored in the note property of an object added to the item's notes array. E.g.:
// bbCite is assumed to hold the citation string obtained earlier
var noteText = "Bluebook citation: " + bbCite + ".";
newItem.notes.push({ note: noteText });
Related
When saving more than one item from a single source, relationships can be established between the items being saved. These relationships are established using two properties of the item object: seeAlso and itemID. To establish a relationship, set itemID to some unique value on one or more of the item objects, and assign an array of the IDs of related items to the seeAlso property of another item object.
Note: The itemID used here is completely ad hoc; it has nothing to do with the internal ID that Zotero assigns items once they are saved. Also, it is not possible to establish a relationship to an item previously saved to Zotero, since non-export translators have no access to the local library.
When the item objects are saved via item.complete(), the relationships will be established. The following code illustrates a simple seeAlso relationship:
function doWeb(doc, url) {
    Zotero.debug("Simple example of setting seeAlso relations");
    let items = [];
    // Real data acquisition would happen here
    var titles = ["Book A", "Book B"];
    for (let title of titles) {
        let item = new Zotero.Item("book");
        item.title = title;
        items.push(item);
    }
    // Assign a bogus itemID to each item in the set
    for (let i = 0; i < items.length; i++) {
        items[i].itemID = "" + i;
    }
    // Set bogus itemIDs in each item's seeAlso
    // field (skipping the item's own ID)
    for (let i = 0; i < items.length; i++) {
        for (let j = 0; j < items.length; j++) {
            if (i === j) {
                continue;
            }
            items[i].seeAlso.push("" + j);
        }
    }
    // Save the items
    for (let item of items) {
        item.complete();
    }
}
Saving Multiple Items
Some webpages, such as those showing search results or the index of a journal issue, list multiple items. For these pages, web translators can be written to a) allow the user to select one or more items and b) batch save the selected items to the user's Zotero library.
Item Selection
To present the user with a selection window that shows all the items that have been found on the webpage, a JavaScript object should be created. Then, for each item, an item ID and label should be stored in the object as a property/value pair. The item ID is used internally by the translator, and can be a URL, DOI, or any other identifier, whereas the label is shown to the user (this will usually be the item's title). Passing the object to the Zotero.selectItems function will trigger the selection window, and the function passed as the second argument will receive an object with the selected items (or false if the user canceled the operation), as in this example:
Zotero.selectItems(getSearchResults(doc, false), function (items) {
    if (items) ZU.processDocuments(Object.keys(items), scrape);
});
Here, Zotero.selectItems(..) is called with an anonymous function as the callback. As in many translators, the selected items are simply loaded into an array and passed off to a processing function that makes requests for each of them.
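The example above relies on a getSearchResults helper that is not shown on this page. The following is a minimal sketch of what such a helper might look like. The selector "a.result-title" is hypothetical and must be adapted to the site, and the second parameter is assumed to be a "check only" flag, as the call with false above suggests.

```javascript
// Sketch of a helper that builds the { id: label } object expected by
// Zotero.selectItems(). The selector is hypothetical; adapt it to the site.
function getSearchResults(doc, checkOnly) {
    const items = {};
    let found = false;
    const rows = doc.querySelectorAll("a.result-title");
    for (const row of rows) {
        const href = row.href;
        // Collapse runs of whitespace so the label reads cleanly
        const title = row.textContent.replace(/\s+/g, " ").trim();
        if (!href || !title) continue;
        // Assumed use: a caller such as detectWeb() only needs to know
        // whether any result exists, so stop at the first one
        if (checkOnly) return true;
        found = true;
        items[href] = title;
    }
    return found ? items : false;
}
```

Using each result's URL as the item ID means that Object.keys(items) in the callback yields the list of pages to request.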
Batch Saving
You will often need to make additional requests to fetch all the metadata needed, either to make multiple items, or to get additional information on a single item. The most common and reliable way to make such requests is with the utility functions Zotero.Utilities.doGet, Zotero.Utilities.doPost, and Zotero.Utilities.processDocuments.
New and updated translators should use the new async HTTP request methods. Documentation is forthcoming, but look for await request in existing translators for examples.
Import Translators
To read in the input text, call Zotero.read():
var line;
while ((line = Zotero.read()) !== false) {
    // Do something
}
If given an integer argument, the function will provide up to the specified number of bytes. Zotero.read() returns false when it reaches the end of the file.
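To show the read loop in context, here is a minimal sketch of an import function for a made-up "Field: value" format in which a blank line separates records. The format and the field names are hypothetical. Zotero.read() and Zotero.Item are supplied by the translator environment, so the function only runs as-is inside Zotero (or against a stand-in for the Zotero object).

```javascript
// Sketch of an import translator for a hypothetical "Field: value" format.
// Zotero.read() and Zotero.Item come from the translator environment.
function doImport() {
    let item = null;
    let line;
    while ((line = Zotero.read()) !== false) {
        line = line.trim();
        if (!line) {
            // A blank line ends the current record
            if (item) item.complete();
            item = null;
            continue;
        }
        const separator = line.indexOf(":");
        if (separator === -1) continue; // skip lines we do not understand
        const field = line.slice(0, separator).trim().toLowerCase();
        const value = line.slice(separator + 1).trim();
        if (!item) item = new Zotero.Item("book");
        if (field === "title") item.title = value;
        else if (field === "year") item.date = value;
    }
    // Save the last record if the input did not end with a blank line
    if (item) item.complete();
}
```

The strict comparison with false matters in this sketch because a blank line is assumed to be returned as an empty string, which is falsy but is not the end of the input.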
If dataMode in the translator metadata is set to rdf/xml or xml/dom, the input will be parsed accordingly, and the data will be made available through Zotero.RDF and Zotero.getXML(), respectively. Documentation for these input modes is not available, but consult the RDF translators ("RDF.js", "Bibliontology RDF.js", "Embedded RDF.js") and XML-based translators ("MODS.js", "CTX.js") to see how these modes can be used.
Creating Collections
To create collections, make a collection object and append objects to its children attribute. As with ordinary Zotero items, you must call collection.complete() to save a collection; otherwise it will be silently discarded.
var item = new Zotero.Item("book");
item.itemID = "my-item-id"; // any string or number
item.complete();
var collection = new Zotero.Collection();
collection.name = "Test Collection";
collection.type = "collection";
collection.children = [{ type: "item", id: "my-item-id" }];
collection.complete();
The children of a collection can include other collections. In this case, collection.complete() should be called only on the top-level collection.
Export Translators
Export translators use Zotero.nextItem() and optionally Zotero.nextCollection() to iterate through the items selected for export, and generally write their output using Zotero.write(text). A minimal translator might be:
function doExport() {
    var item;
    while ((item = Zotero.nextItem())) {
        Zotero.write(item.title);
    }
}
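A slightly fuller sketch of the same loop follows. It writes one tab-separated line per item. The output layout is an arbitrary choice for illustration, and the sketch assumes that exported items expose title, date, and a creators array whose entries have a lastName property. Zotero.nextItem() and Zotero.write() are supplied by the translator environment.

```javascript
// Sketch of an export translator that writes one tab-separated line per item.
// Zotero.nextItem() and Zotero.write() come from the translator environment;
// the column layout is an arbitrary choice for illustration.
function doExport() {
    let item;
    while ((item = Zotero.nextItem())) {
        // Join the creators' last names, skipping entries without one
        const authors = (item.creators || [])
            .map(creator => creator.lastName)
            .filter(Boolean)
            .join("; ");
        const columns = [item.title || "", authors, item.date || ""];
        Zotero.write(columns.join("\t") + "\n");
    }
}
```

Note that Zotero.write() is given the line terminator explicitly; the sketch does not assume that one is appended automatically.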
As with import translators, it is also possible to produce XML and RDF/XML using Zotero.RDF. See for example Zotero RDF, an RDF export translator that also deals with collections.
Exporting Collections
If configOptions in the translator metadata has the getCollections attribute set to true, the Zotero.nextCollection() call will be available. It provides collection objects like those created on import.
var collection;
while ((collection = Zotero.nextCollection())) {
    // Do something
}
The function Zotero.nextCollection() returns a collection object:
{
    id: "ABCD1234", // Eight-character hexadecimal key
    children: [item, item, .., item], // Array of Zotero item objects
    name: "Test Collection"
}
The collection ID here is the same thing as the collection key used in API calls.
Search Translators
The detectSearch and doSearch functions of search translators are passed item objects. On any given input, detectSearch should return true or false, as in "COinS.js":
function detectSearch(item) {
    if (item.itemType === "journalArticle" || item.DOI) {
        return true;
    }
    return false;
}
doSearch should augment the provided item with additional information and call item.complete() when done. Since search translators are never called directly, but only by other translators or by the Add Item by Identifier (magic wand) function, it is common for the information to be further processed in an itemDone handler specified in the calling translator.
Further Reference
Utility Functions
Zotero provides several utility functions for translators to use. Some of them are used for asynchronous and synchronous HTTP requests; those are discussed above. In addition to those HTTP functions and the many standard functions provided by JavaScript, Zotero provides:
- Zotero.Utilities.capitalizeTitle(title, ignorePreference)
  Applies English-style title case to the string, if the capitalizeTitles hidden preference is set. If ignorePreference is true, title case will be applied even if the preference is set to false. This function is often useful for fixing capitalization of personal names, in conjunction with the built-in string method text.toLowerCase().
- Zotero.Utilities.cleanAuthor(author, creatorType, hasComma)
  Attempts to split the given string into firstName and lastName components, splitting on a comma if desired, and performs some clean-up (e.g. removes unnecessary whitespace and punctuation). The creatorType (see the list of valid creator types for each item type) is simply passed through. Returns a creator object of the form { lastName: , firstName: , creatorType: }, which can, for example, be passed directly to item.creators.push().
- Zotero.Utilities.getItemArray(doc, node, includeRegex, excludeRegex)
  Given the current DOM document, and a node or nodes in that document, returns an associative array of link ⇒ textContent pairs, suitable for passing to Zotero.selectItems(..). All children of the specified node with HREF attributes that are matched by includeRegex and/or not matched by excludeRegex are included in the array.
  var items = Zotero.Utilities.getItemArray(
      doc,
      doc.getElementById("MainColumn").getElementsByTagName("h1"),
      '/artikel/.+\\.html'
  );
  Zotero.selectItems(items, processCallback);
- Zotero.Utilities.trimInternal(text)
  Removes extra internal whitespace from the text and returns it. This is frequently useful for post-processing text extracted using XPath, which frequently has odd internal whitespace.
- Zotero.Utilities.xpath(elements, xpath, [namespaces])
  Evaluates the specified XPath on the DOM element or array of DOM elements given, with the optionally specified namespaces. If present, the third argument should be an object whose keys represent namespace prefixes, and whose values represent their URIs. Returns an array of matching DOM elements, or null if no match. (Added in Zotero 2.1.9)
- Zotero.Utilities.xpathText(elements, xpath, [namespaces], [delimiter])
  Generates a string from the content of nodes matching a given XPath, as in Zotero.Utilities.xpath(..). By default, the nodes' content is delimited by commas; a different delimiter symbol or string may be specified. (Added in Zotero 2.1.9)
- Zotero.Utilities.removeDiacritics(str, lowercaseOnly)
  Removes diacritics from a string, returning the result. The second argument is an optimization that specifies that only lowercase diacritics should be replaced. (Added in Zotero 3.0)
- Zotero.debug(text)
  Prints the specified message to the debug log at zotero://debug.
Zotero.Utilities can optionally be replaced with the shorthand ZU and Zotero with Z, as in ZU.capitalizeTitle(..) and Z.debug(..).
Function and Object Index
See also the Function and Object Index, which lists (without documentation) all the functions and objects that are accessible to translators.
Calling other translators
Web translators can call other translators to parse metadata provided in a standard format with the help of existing import translators, or to augment incomplete data with the help of search translators. There are several ways of invoking other translators.
Calling a translator by UUID
This is the most common way to use another translator: simply specify the translator type and the UUID of the desired translator. In this case, the RIS translator is being called.
var translator = Zotero.loadTranslator("import");
translator.setTranslator("32d59d2d-b65a-4da4-b0a3-bdd3cfb979e7");
translator.setString(text);
translator.translate();
Calling a translator using getTranslators
This code, based on the "COinS.js" code, calls getTranslators() to identify which search translators can make a complete item out of the basic template information already present. Note that translate() is called from within the event handler. Analogous logic could be used to get the right import translator for incoming metadata in an unknown format.
var search = Zotero.loadTranslator("search");
search.setHandler("translators", function (obj, translators) {
    search.setTranslator(translators);
    search.translate();
});
search.setSearch(item);
// look for translators for given item
search.getTranslators();
Using getTranslatorObject
The MARC translator is one of several translators that provide an interface to their internal logic by exposing several objects, listed in their exports array. Here, it provides an object that encapsulates the MARC logic. The translator can also take binary MARC as input via setString, but the approach shown here provides a way for library catalog translators to feed human-readable MARC into the translator.
// Load MARC
var translator = Zotero.loadTranslator("import");
translator.setTranslator("a6ee60df-1ddc-4aae-bb25-45e0537be973");
translator.getTranslatorObject(function (obj) {
    var record = obj.record();
    record.leader = "leader goes here";
    record.addField(code, indicator, content);
    var item = new Zotero.Item();
    record.translate(item);
    item.libraryCatalog = "Zotero.org Library Catalog";
    item.complete();
});
Method overview
- Zotero.loadTranslator(type)
  Type should be one of import or search. Returns an object with the methods below.
- translator.setSearch(item)
  For search translators. Sets the skeleton item object the translator will use for its search.
- translator.setString(string)
  For import translators. Sets the string that the translator will import from.
- translator.setDocument(document)
  For web translators. Sets the document that the translator will use.
- translator.setTranslator(translator)
  Takes a translator object (returned by getTranslators(..)), or the UUID of a translator.
- translator.setHandler(event, callback)
  Valid events are itemDone, done, translators, error. The itemDone handler is called on each invocation of item.complete() in the translator, and the specified callback is passed two arguments: the translator object and the item in question. Note: The itemDone callback is responsible for calling item.complete() on the item it receives, otherwise the item will not be saved to the database.
- translator.getTranslators()
  Sends a translators event to the registered handler (use setHandler, above). The handler will be called with, as its second argument, an array of those translators that return a non-false value for detectImport, detectSearch or detectWeb when passed the input given with setString, setSearch, etc.
- translator.getTranslatorObject(callback)
  The callback is passed an object that has the variables and functions defined in the translator as attributes and methods. In connectors, only the exports object, if present in the translator, will be passed to the callback. If an exports object is present, other functions and variables in the translator will not be passed to the callback, even when running in Firefox.
  This is typically used when calling import translators that define utility functions, like the MARC and RDF translators. Despite the unfortunate nomenclature, this object is not the same thing as the object returned by getTranslators(..) or by Zotero.loadTranslator().
- translator.translate()
  Runs the translator on the given input.
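The methods above can be combined as in the following minimal sketch, which runs the RIS import translator on a string and uses an itemDone handler to adjust each item before saving it. The function name importRis and its pageUrl parameter are hypothetical; the Zotero object is supplied by the translator environment.

```javascript
// Sketch: run the RIS import translator on a string and adjust each item
// before saving. importRis and pageUrl are hypothetical names; Zotero is
// supplied by the translator environment.
function importRis(risText, pageUrl) {
    const translator = Zotero.loadTranslator("import");
    // UUID of the RIS translator, as used earlier on this page
    translator.setTranslator("32d59d2d-b65a-4da4-b0a3-bdd3cfb979e7");
    translator.setString(risText);
    translator.setHandler("itemDone", function (obj, item) {
        // Fill in a field that the imported data may lack
        if (!item.url) item.url = pageUrl;
        // Required: without this call the item is not saved
        item.complete();
    });
    translator.translate();
}
```

The handler is registered before translate() is called, so every item produced by the import translator passes through it.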