It looks like it's pulling characters from the paragraph to generate the "unique" paragraph ID. ID = the first letter of each of the first 3 words in the paragraph's first sentence + the first letter of each of the first 3 words in the paragraph's last sentence.
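Here's a rough sketch of that scheme as I understand it from looking at the IDs, not the site's actual code. The function name `paragraph_id`, the naive sentence splitting, and the choice to take each word's first character as-is (including a leading quote mark) are all my assumptions.

```python
import re


def paragraph_id(paragraph: str) -> str:
    """Guess at the ID scheme: initials of the first 3 words of the first
    sentence, then initials of the first 3 words of the last sentence.
    (Hypothetical helper; the real implementation is unknown.)"""
    # Assumption: a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    if not sentences:
        return ""

    def initials(sentence: str) -> str:
        # Case is preserved; punctuation at the start of a word is not stripped.
        return "".join(word[0] for word in sentence.split()[:3])

    return initials(sentences[0]) + initials(sentences[-1])


print(paragraph_id("The quick brown fox jumps. Over the lazy dog it went."))
print(paragraph_id("Hello there my friend."))
```

Note that a single-sentence paragraph is both its own first and last sentence, so under this reading its ID is just the same three letters repeated.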
I wonder... for all the different articles on NYTimes, and the different configurations of words across paragraphs, is this unique enough such that you won't get duplicate paragraph IDs in any given article?
It only has to be unique within the article, since it's added to the article path, and there would likely be some kind of provision to add or swap out for a unique character in case of conflict. It's also case-preserving, so that implies likely case-sensitivity as well. I guess we'll have to find an instance of two - probably single-sentence - paragraphs with the same characters and same capitalization in the same story to be certain.
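If someone wants to go hunting for a collision, something like this would do the check once you have the list of IDs for one article. It's only a sketch: `duplicate_ids` is a name I made up, and the comparison is case-sensitive on the assumption that the IDs really are.

```python
from collections import Counter


def duplicate_ids(ids):
    """Return the IDs that occur more than once in a single article,
    with their counts. Comparison is case-sensitive (an assumption)."""
    counts = Counter(ids)
    return {pid: n for pid, n in counts.items() if n > 1}


# Made-up IDs for one article, just to show the shape of the result.
article_ids = ["TqbOtl", "HtmHtm", "HtmHtm", "htmhtm"]
print(duplicate_ids(article_ids))
```

An empty result for an article means no conflict there; it says nothing about how the site would resolve one if it happened.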