Website Restoration Project
This archiving project is a collaboration between Unfiction and Sean Stacey (SpaceBass), Brian Enigma (BrianEnigma), and Laura E. Hall (lehall) with the Center for Immersive Arts.
Announcements
This is a static snapshot of the Unfiction forums, as of July 23, 2017. This site is intended as an archive to chronicle the history of Alternate Reality Games.
 
Text Comparison Tool
Moderators: imbri
BrumGuy
Boot

Joined: 03 Mar 2007
Posts: 66

Text Comparison Tool

Hi All,


I've had a search through and cannot see anything like this mentioned anywhere else, but do apologise if I have missed it.

I'm looking for some sort of tool that will scan through large bodies of text (web pages included) and highlight or list phrases which are common to both/all.

I'm not even certain that these exist outside of universities, which I've heard now use them to catch out students who have decided to take a bit of a short-cut on their assignments, but thought I'd still ask the question here to see if there were any available.


Thanks in advance,


Brum

Posted: Sun May 13, 2007 11:06 am
thebruce
Dances With Wikis


Joined: 16 Aug 2004
Posts: 6899
Location: Kitchener, Ontario

A quick google should bring up a number of shareware/freeware programs... but I currently have Beyond Compare installed on my machine (www.scootersoftware.com), for example.
_________________
@4DFiction/@Wikibruce/Contact
ARGFest 2013 - Seattle! ARGFest.com


Posted: Sun May 13, 2007 12:42 pm
catherwood
I Have 100 Cats and Smell of Wee

Joined: 25 Sep 2002
Posts: 4109
Location: Silicon Valley, CA

The programs I use highlight *differences* rather than the text which is the same in two files. I use WinDiff myself, but some people prefer FileSynch for that.

(edit to add) WinDiff can also compare two directories, telling you which files differ between directory listings, as well as which files having the same name are different.
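Diff tools report differences, but the same alignment also tells you which parts match. As a rough sketch only (not something from WinDiff or FileSynch), Python's standard `difflib` can list the shared runs instead. The function name `common_runs` and the `min_words` cutoff are made up for illustration. Note that this aligns the two texts in order, so a phrase that appears in a different position relative to other matches may be missed.

```python
from difflib import SequenceMatcher


def common_runs(a: str, b: str, min_words: int = 3) -> list[str]:
    """Return runs of at least min_words consecutive words shared by a and b."""
    words_a = a.split()
    words_b = b.split()
    # autojunk=False stops difflib from discarding very frequent words
    # when the inputs are long.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    runs = []
    for block in matcher.get_matching_blocks():
        # The final block is a zero-length sentinel, filtered out here.
        if block.size >= min_words:
            runs.append(" ".join(words_a[block.a:block.a + block.size]))
    return runs


text_one = "the quick brown fox jumps over the lazy dog near the river"
text_two = "yesterday a quick brown fox jumps past the lazy dog near town"
print(common_runs(text_one, text_two))
# ['quick brown fox jumps', 'the lazy dog near']
```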

Posted: Sun May 13, 2007 12:55 pm
Last edited by catherwood on Mon May 14, 2007 12:41 am; edited 1 time in total
BrumGuy
Boot

Joined: 03 Mar 2007
Posts: 66

Hi both - and thanks for the quick replies.


I've had a quick peep at "Beyond Compare", or more accurately the website detailing what it can do, and it seems to be much the same as FileSynch and WinDiff, so I'm not sure that it's what I am looking for.

Essentially, I want something that will search through selected website pages, (the more the merrier), and highlight strings of text which are common to at least two of them. Basically, a common phrase spotter, that I do not need to know what the phrase is prior to starting the comparison.

Any and all help is and will be appreciated, as my own google trawling hasn't turned up anything useful, but then again I do not even know if my search terms are correct!


Thanks for the help so far, and thebruce, have I misunderstood about "Beyond Compare"?


Brum

Posted: Sun May 13, 2007 9:55 pm
thebruce
Dances With Wikis


Joined: 16 Aug 2004
Posts: 6899
Location: Kitchener, Ontario

BrumGuy wrote:
Essentially, I want something that will search through selected website pages, (the more the merrier), and highlight strings of text which are common to at least two of them. Basically, a common phrase spotter, that I do not need to know what the phrase is prior to starting the comparison.

hm. Well in that case I haven't come across any apps that do that specifically, but I haven't searched for that ability. Even so, you could emulate that by using, for instance, WinHTTrack to download the html from a website, then running the text comparison tool.

Doing a remote comparison of two websites seems like a big step when essentially all it will be doing is downloading the urls in question, then comparing them locally (whether saved as files or not). *shrug*

It probably wouldn't be hard to extend a file comparison utility to include comparing urls... you could visit some open source projects and suggest a feature such as that, if no software currently exists. But as I mentioned, the workaround exists to download the pages to compare, then run the test locally.
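The download-then-compare-locally workaround can be sketched in a few lines of Python using only the standard library. This is an illustration under assumptions, not a feature of any tool named in this thread: the names `html_to_text` and `fetch_text` are hypothetical, and the tag stripping is deliberately crude (it only skips script and style contents). The example at the bottom runs on an inline HTML string, so no network access is needed to try it.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML document to its visible text, joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its visible text (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return html_to_text(html)


sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><p>Hello <b>world</b></p><script>var x = 1;</script></body></html>"
)
print(html_to_text(sample))  # Hello world
```

Once each page is reduced to plain text like this, any local comparison tool (or script) can take it from there.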

Hope that's a little more helpful ;)


Posted: Sun May 13, 2007 11:31 pm
thebruce
Dances With Wikis


Joined: 16 Aug 2004
Posts: 6899
Location: Kitchener, Ontario

double post: just did a quick google and came across a possible contender - oxygenxml editor (based on this ability, including url comparison)

I haven't checked it out, but it looks like it might accomplish what you're looking for


Posted: Sun May 13, 2007 11:37 pm
Rogi Ocnorb
I Have 100 Cats and Smell of Wee


Joined: 01 Sep 2005
Posts: 4266
Location: Where the cheese is free.

The only tools I have encountered along these lines are duplicate file checkers for system cleanup activities. However, those always compare files at the file level. You might find the tool you seek by searching for a feature it would need in its basic operation: the ability to compare "paragraphs", "sentences", "phrases" and/or a "minimum number of consecutive words". If it compared files only at the word level and only used single words in its searches, it would be a pretty useless tool.
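The "minimum number of consecutive words" idea is easy to sketch: break every document into overlapping runs of n words and keep the runs that turn up in at least two documents. The Python below is a minimal illustration under that assumption, not a description of any existing product. The names `word_ngrams` and `shared_phrases` and the sample pages are invented for the example.

```python
import re
from collections import defaultdict


def word_ngrams(text: str, n: int) -> set[str]:
    """Return every run of n consecutive words in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(docs: dict[str, str], n: int = 4) -> dict[str, list[str]]:
    """Map each n-word phrase found in two or more documents to their names."""
    seen = defaultdict(set)
    for name, text in docs.items():
        for phrase in word_ngrams(text, n):
            seen[phrase].add(name)
    return {p: sorted(names) for p, names in seen.items() if len(names) >= 2}


pages = {
    "page_a": "Meet me at the old clock tower at midnight.",
    "page_b": "She said: meet me at the old mill instead.",
    "page_c": "Nothing in common here.",
}
for phrase, names in sorted(shared_phrases(pages, n=4).items()):
    print(phrase, "->", names)
# me at the old -> ['page_a', 'page_b']
# meet me at the -> ['page_a', 'page_b']
```

A longer shared passage shows up as several overlapping hits, as in the output above. Raising n cuts down on noise at the cost of missing shorter phrases.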
_________________
I'm telling you now, so you can't say, "Oh, I didn't know...Nobody told me!"


Posted: Mon May 14, 2007 12:23 am
catherwood
I Have 100 Cats and Smell of Wee

Joined: 25 Sep 2002
Posts: 4109
Location: Silicon Valley, CA

I returned to add to my post above, but then saw all of the replies. If you don't know in advance what terms two files (or webpages) might have in common, how will you judge the results? Any two random webpages are likely to share many words in common: the, click, menu, home, copyright, etc. -- maybe you just want to compare meta keywords instead?
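One common way around the noise from everyday words is to discard any candidate phrase made up entirely of them. A tiny Python sketch, assuming a hand-picked stop list (the `STOPWORDS` set and the name `is_informative` are illustrative only, and a real list would be far longer):

```python
# Illustrative stop list only; a real one would be much longer.
STOPWORDS = frozenset(
    {"the", "a", "an", "and", "of", "to", "in", "click", "menu", "home", "copyright"}
)


def is_informative(phrase: str, stopwords: frozenset[str] = STOPWORDS) -> bool:
    """True if at least one word in the phrase is not a stopword."""
    return any(word not in stopwords for word in phrase.lower().split())


candidates = ["click the menu", "home", "the rabbit hole", "copyright"]
print([p for p in candidates if is_informative(p)])
# ['the rabbit hole']
```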

Posted: Mon May 14, 2007 12:45 am
realityshifter
Boot

Joined: 03 Jul 2007
Posts: 29

I'm not sure if this tool will help, but there is a site called Copyscape that I commonly use to make sure nobody else is copying the content from my own web site, and I think it might work in a roundabout way for the purpose you described.

If you know the URL of one of the pages, you can type the URL into the search field at copyscape.com and it will return a list of search results with other sites/pages throughout the net that have matching text. It will rank the search results based on which ones have the most matching text when compared to the URL you entered, and it will highlight the matching parts too, which makes it very handy. So, if you enter a URL and there are phrases or entire sentences or paragraphs on that page that also show up somewhere on another site, the other site will turn up in the search results and you can click on the result to see a view of the other site with those matching phrases/sentences highlighted.

There are two downsides: First, you'd have to know the URL of at least one of the pages in order for the Copyscape site to be useful for you. Second, you can only perform a certain number of free searches per month on any given domain name. I think the limit is ten free URL searches per domain per month.

Posted: Thu Jul 05, 2007 5:12 pm
redct
Entrenched


Joined: 20 Jun 2007
Posts: 1233

WinDiff is a great tool. If you're on a Mac/Linux/Unix/etc you can just use diff. It supports a wide range of options (ignore case, ignore spacing, ignore tabs and a whole bunch of other things)
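For anyone scripting this rather than using the command line, the same idea of ignoring case and spacing can be approximated in Python by normalising each line before comparing. This is only a sketch of the concept, roughly in the spirit of GNU diff's -i and -b options, not a reimplementation of diff. The helper names `normalise` and `loose_diff` are made up.

```python
import difflib


def normalise(line: str) -> str:
    """Lowercase a line and collapse runs of spaces/tabs to one space."""
    return " ".join(line.lower().split())


def loose_diff(a: str, b: str) -> list[str]:
    """Unified diff of two texts, ignoring case and spacing differences."""
    a_lines = [normalise(line) for line in a.splitlines()]
    b_lines = [normalise(line) for line in b.splitlines()]
    return list(difflib.unified_diff(a_lines, b_lines, "a", "b", lineterm=""))


# Differs only in case and spacing, so nothing is reported.
print(loose_diff("Hello   World\nfoo", "hello world\nfoo"))  # []

# A real difference still shows up.
for line in loose_diff("same line\nold text", "same line\nnew text"):
    print(line)
```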

Posted: Mon Aug 06, 2007 1:27 pm