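// Damn this lousy mouse: I had been writing for ages, nothing had gone wrong, and then the page jumped away and everything was gone!!
Rewriting: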
First, download the required software:
tomcat http://apache.mirror.phpchina.com/tomcat/tomcat-4/v4.1.37/bin/apache-tomcat-4.1.37.tar.gz
nutch http://apache.mirror.phpchina.com/lucene/nutch/nutch-0.7.2.tar.gz
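The download step might be scripted roughly as follows. This is only a sketch: the helper name fetch_and_unpack is made up for illustration, and it assumes wget and GNU tar are installed and that everything lives under /opt, the prefix the environment script in this post uses.

```shell
#!/bin/sh
# Hypothetical helper: download a tarball into a directory (default /opt)
# and unpack it in place.
fetch_and_unpack() {
    url=$1
    dest=${2:-/opt}
    # Name the local file after the last path component of the URL.
    archive=$dest/${url##*/}
    wget -O "$archive" "$url" && tar -xzf "$archive" -C "$dest"
}
```

It would be called once per URL above, for example `fetch_and_unpack http://apache.mirror.phpchina.com/lucene/nutch/nutch-0.7.2.tar.gz`. The Tomcat tarball presumably unpacks to a versioned directory, so it would need renaming or symlinking to /opt/tomcat to match the environment script.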
For the JDK setup, see this article: http://www.sunboyu.cn/2008/09/20/centos5%E4%B8%8B%E9%85%8D%E7%BD%AEjdk%E7%8E%AF%E5%A2%83.shtml
While I was at it, I wrote a small script to set the environment variables:
- #!/bin/sh
- # author:sunboyu@gmail.com
- # qq:176300676 msn:sunboyu@gmail.com
- # http://www.sunboyu.cn
- # Source this script rather than executing it, so the exports persist in the calling shell.
- export JAVA_HOME=/opt/jdk1.6.0
- export CLASSPATH=.:/opt/jdk1.6.0/lib/tools.jar:/opt/jdk1.6.0/lib/dt.jar:/opt/jdk1.6.0
- export PATH=$PATH:/opt/jdk1.6.0/bin
- export JRE_HOME=/opt/jdk1.6.0
- export CATALINA_BASE=/opt/tomcat
- export CATALINA_HOME=/opt/tomcat
- export CATALINA_TMPDIR=/opt/tomcat/temp
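A quick sanity check after loading the script can save some head-scratching later. The helper below is a sketch of my own, not part of the original script: check_env is a hypothetical name, and it only tests that each variable is set and points at an existing directory.

```shell
#!/bin/sh
# Hypothetical helper: report variables from the environment script that are
# unset or that point at a directory which does not exist.
check_env() {
    status=0
    for name in JAVA_HOME JRE_HOME CATALINA_HOME CATALINA_BASE; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"
            status=1
        fi
    done
    return $status
}
```

Assuming the script was saved as nutch-env.sh (a made-up file name), `. ./nutch-env.sh && check_env` prints nothing when everything is in place.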
Deploy nutch-0.7.2.war from the nutch directory as Tomcat's default site (the ROOT web application).
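One way to do that deployment by hand is sketched below. The function name deploy_as_root is hypothetical, and it assumes unzip is installed (`jar xf` from the JDK would do the same job). The stock ROOT application is moved out of webapps instead of being deleted.

```shell
#!/bin/sh
# Hypothetical helper: unpack a WAR as Tomcat's ROOT web application.
#   $1 = path to the WAR file, $2 = Tomcat home directory
deploy_as_root() {
    war=$1
    tomcat=$2
    webapps=$tomcat/webapps
    [ -f "$war" ] || { echo "no such WAR: $war" >&2; return 1; }
    # Keep the stock ROOT application, but outside webapps so that Tomcat
    # does not deploy it as a second application.
    if [ -d "$webapps/ROOT" ]; then
        mv "$webapps/ROOT" "$tomcat/ROOT.orig" || return 1
    fi
    mkdir -p "$webapps/ROOT" && unzip -q "$war" -d "$webapps/ROOT"
}
```

With the layout used in this post, the call would look something like `deploy_as_root /opt/nutch-0.7.2/nutch-0.7.2.war /opt/tomcat`, where the nutch path is my assumption about where the tarball was unpacked.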
Edit webapps/ROOT/WEB-INF/classes/nutch-site.xml under the Tomcat directory
and add the following configuration:
- <property>
- <name>searcher.dir</name>
- <value>/local/nutch/crawl</value>
- </property>
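Since the searcher only returns results when searcher.dir is the directory the crawl actually wrote to, it can help to read the value back from the file. The helper below is a rough sketch: searcher_dir is a made-up name, and the sed expression assumes the name and value elements sit on consecutive lines, as in the snippet above.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> found on the line right after
# <name>searcher.dir</name> in the given nutch-site.xml.
searcher_dir() {
    sed -n '/<name>searcher\.dir<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}' "$1"
}
```

For example, `ls "$(searcher_dir /opt/tomcat/webapps/ROOT/WEB-INF/classes/nutch-site.xml)"` should list the crawl output once the crawl has run.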
Start Tomcat!
Then run the following nutch command from the nutch directory:
bin/nutch crawl urls -dir /test -depth 5 -topN 1000 -threads 5
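This builds the crawled index under /test. Note that searcher.dir in nutch-site.xml has to point at the same directory, so either set it to /test or crawl into /local/nutch/crawl instead.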
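The crawl command reads its starting points from the urls argument, so that seed list has to exist first. A minimal sketch follows, with assumptions flagged: http://www.example.com/ is a placeholder seed, run_crawl is a hypothetical wrapper, and both are meant to be used from inside the nutch directory. The crawl tool also applies the URL filters in conf/crawl-urlfilter.txt, so the seed's domain has to be allowed there.

```shell
#!/bin/sh
# Seed list for the "urls" argument of the crawl command.
# The URL is a placeholder, not taken from the original setup.
printf '%s\n' 'http://www.example.com/' > urls

# Hypothetical wrapper: same options as the command above, with all output
# captured in crawl.log so that failures can be inspected afterwards.
run_crawl() {
    crawl_dir=${1:-/test}
    bin/nutch crawl urls -dir "$crawl_dir" -depth 5 -topN 1000 -threads 5 \
        > crawl.log 2>&1
}
```

Calling `run_crawl` with no argument crawls into /test as above, and `run_crawl /local/nutch/crawl` would match the searcher.dir value configured earlier.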
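Then try out the search in the Tomcat web application! If Tomcat was already running before the crawl finished, restart it so the searcher picks up the new index.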