TODO: Introduce categories, as ticket systems offer?

Contents

1 Installer
- 1.1 VM Ware Image
2 Clustering
3 Link Farm Crawling
4 Feedback to rate search-result quality
5 More from this page
6 External Blacklists
7 RDF support

Installer

VM Ware Image

Linux VMware Player image (security++, simplicity++)

Clustering

To improve the search results (which are usually not in a useful order), it might help to introduce clustering, as can be seen on Clusty.com -- MovGP0 16:29, 4. Mai 2006 (CEST)

Link Farm Crawling

To fight link farms full of spam, it might make sense to crawl such pages with a link depth of about 1-2, collect all of the linked domains, and greylist them. Afterwards, the user should get a web page with links to those pages, so the user can check the result when unsure whether a specific link is really spam. The user can then decide to remove some domains from this list. The rest get added to the blacklist. If a domain occurs about 1 + floor(sqrt(NumberOfPeers)) times in blacklists, the site might get blocked within the whole YaCy network -- MovGP0 16:29, 4. Mai 2006 (CEST)

- Blocking a site across the whole network is not possible. We have no control over what a peer owner does. But we could send a news message from our peer, which could serve as a hint for other peer owners. If that amounts to more than one page moderation per day, though, it is too much work for the other peer owners ...
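
The threshold proposed above can be made concrete with a small sketch. It is only an illustration, not YaCy code: the function names are hypothetical, and reading "occurs about N times" as "is listed by at least N distinct peers" is an assumption. In line with the reply above, the result is best treated as a hint for peer owners rather than a network-wide block.

```python
import math
from collections import Counter


def blacklist_threshold(number_of_peers: int) -> int:
    """Threshold from the proposal: 1 + floor(sqrt(NumberOfPeers))."""
    return 1 + math.isqrt(number_of_peers)


def domains_to_flag(peer_blacklists, number_of_peers):
    """Return the domains listed by at least `threshold` distinct peers.

    `peer_blacklists` is a hypothetical input: one iterable of domain
    names per peer.
    """
    threshold = blacklist_threshold(number_of_peers)
    counts = Counter()
    for blacklist in peer_blacklists:
        # A set ensures each peer counts at most once per domain.
        counts.update(set(blacklist))
    return sorted(d for d, n in counts.items() if n >= threshold)
```

With 100 peers the threshold is 11, so a domain would need to appear on the blacklists of 11 different peers before it is flagged.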

Feedback to rate search-result quality

At the end of a page with search results, I would like to be able to give "you" feedback: whether YaCy found my page or the information I was looking for, and perhaps where I finally found that information or which page is not yet part of our index. I think this could be a good way to improve the quality of YaCy... --GoogleFan 14:51, 2. Jun 2006 (CEST)
- There is no "you"; YaCy is decentralized. Your peer can of course give feedback to other peers.--Allo 15:31, 4. Jun 2006 (CEST)
  - What about Seeks? --Ktplulo 19:06, 10. Mär. 2012 (CET)

More from this page

Just show a few results per domain, plus a link/button "more from this site", so that when I try to find information about a company/site (e.g. microsoft) the results aren't flooded with results from their own site. Helpful when I do some research and don't want to get all the marketing crap.--Neo@NHNG 14:58, 15. Feb 2008 (CET)
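
A minimal sketch of this idea, assuming the search already returns a ranked list of URLs. The function name and the `per_domain` parameter are hypothetical, and grouping is done by full host name, which is a simplification (www.example.com and example.com count as different sites).

```python
from collections import defaultdict
from urllib.parse import urlsplit


def collapse_by_domain(urls, per_domain=2):
    """Split ranked results into shown results and per-host overflow.

    Returns (shown, hidden): `shown` keeps the original ranking order,
    `hidden` maps each host to the results behind "more from this site".
    """
    shown = []
    hidden = defaultdict(list)
    seen = defaultdict(int)
    for url in urls:
        host = urlsplit(url).hostname or ""
        if seen[host] < per_domain:
            shown.append(url)
        else:
            hidden[host].append(url)
        seen[host] += 1
    return shown, dict(hidden)
```

A result page would render `shown` and add a "more from this site" link for every host that has an entry in `hidden`.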

External Blacklists

It would be very useful to include an external URL blacklist lookup feature in the crawler. URIBL and SURBL are probably the best-known blacklists.
- http://www.uribl.com/
- http://www.surbl.org/

--Ott 13:27, 26. Jan 2009 (CET)
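
Both lists are queried over DNS, so a crawler-side check could look roughly like the sketch below. It is not YaCy code, and several parts are assumptions: the zone names `multi.uribl.com` and `multi.surbl.org`, the use of the full host name instead of the registered domain, and treating any `127.x` answer as "listed". The meaning of individual return codes is list-specific (some signal a refused query rather than a listing), so each list's documentation and usage policy should be checked before relying on this.

```python
import socket
from urllib.parse import urlsplit

# Assumed lookup zones; verify against each list's documentation.
ZONES = ("multi.uribl.com", "multi.surbl.org")


def query_names(url, zones=ZONES):
    """Build the DNS names to look up for the host of `url`."""
    host = urlsplit(url).hostname
    if not host:
        return []
    return [f"{host}.{zone}" for zone in zones]


def is_listed(url, resolve=socket.gethostbyname):
    """Return True if any zone answers with a 127.x address.

    `resolve` is injectable so the check can be tested offline.
    """
    for name in query_names(url):
        try:
            answer = resolve(name)
        except OSError:
            # No record (or lookup failure): treated as not listed here.
            continue
        if answer.startswith("127."):
            return True
    return False
```

The crawler could call `is_listed(url)` before fetching a page and skip or greylist the URL on a positive answer.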

RDF support

RDF storage based on the Jena Framework.
If the crawler finds an RDF file (which means .rdf, .owl, and .foaf files) or RDF markup within an XHTML file, the content of this RDF should get copied into a distributed Jena-based semantic storage (as far as I know, Jena is not meant to support distributed computing/querying, so you might need to develop your own storage). It should also be possible to run global SPARQL queries on this storage. A timeout is needed as well, so that semantic queries won't take too many resources.
This is also interesting for offering RSS 1.0 and RSS 1.1 support.
Note that I consider this wish a realistic goal for version 3.x. RSS 0.9, RSS 0.91, and RSS 0.92 should not be supported, because they are not compatible with RDF.
-- MovGP0 15:39, 4. Mai 2006 (CEST)
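
The first step, recognising RDF files by extension, could be sketched as follows. The function name is hypothetical, and only the three extensions named above are checked. Detecting RDF markup inside XHTML, the storage itself, and the query timeout are not covered.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Extensions named in the wish above.
RDF_EXTENSIONS = {".rdf", ".owl", ".foaf"}


def looks_like_rdf(url):
    """Return True if the URL path ends in one of the RDF extensions."""
    path = urlsplit(url).path
    return PurePosixPath(path).suffix.lower() in RDF_EXTENSIONS
```

A real crawler would probably also look at the Content-Type header, since many RDF documents are served without one of these extensions.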

Dev:Wishlist
