{"id":607,"date":"2009-10-27T07:30:47","date_gmt":"2009-10-27T06:30:47","guid":{"rendered":"http:\/\/blogs.wittwer.fr\/whiler\/?p=607"},"modified":"2012-03-02T03:40:46","modified_gmt":"2012-03-02T02:40:46","slug":"blog-recherches-lemmatisees","status":"publish","type":"post","link":"https:\/\/blogs.wittwer.fr\/whiler\/2009\/10\/27\/blog-recherches-lemmatisees\/","title":{"rendered":"Lemmatized searches&#8230; an attempt!"},"content":{"rendered":"<p>WordPress includes a search field for displaying only the posts you want&#8230;<\/p>\n<p>I have grown used to luxurious searching with <a href=\"https:\/\/fr.sinequa.com\/produit.aspx\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"\/whiler\/wp-content\/uploads\/2009\/10\/sinequa.png\" alt=\"Sinequa\" title=\"Visit the Sinequa website\" width=\"75\" height=\"12\" class=\"size-full wp-image-649\" \/><\/a>, which among other things lets you run searches in which words can be <a href=\"http:\/\/fr.wikipedia.org\/wiki\/Lemmatisation\" rel=\"glossary\" target=\"_blank\" title=\"Wikipedia, definition of: lemmatized\" style=\"\" >lemmatized<\/a><sup style=\"font-family: Georgia, Times New Roman, Serif; font-weight: bold; color: #AAAAAA\" ><em>W<\/em><\/sup>:<\/p>\n<ul>\n<li>The canonical form of each word (its lemma) is used at indexing time, then again at search time.<\/li>\n<li>For example, if a document contains &ldquo;chevaux&rdquo; (the French plural of &ldquo;cheval&rdquo;, horse), you can search for &ldquo;cheval&rdquo; and still find that document&#8230;<br \/>\nA classic SQL search ( <code class=\"codecolorer sql dawn\"><span class=\"sql\"><span class=\"kw1\">LIKE<\/span> <span class=\"st0\">'%cheval%'<\/span><\/span><\/code> ) does not return that document, because &ldquo;chevaux&rdquo; does not contain the substring &ldquo;cheval&rdquo;.<\/li>\n<\/ul>\n<p>I then searched the web for a WordPress plugin that would provide a 
solution:<!--more--><\/p>\n<ul>\n<li><a href=\"https:\/\/wordpress.org\/extend\/plugins\/wpsearch\/\" target=\"_blank\">wpSearch<\/a> is a solution that would almost meet my needs&#8230;\n<ul>\n<li>it integrates <a href=\"http:\/\/fr.wikipedia.org\/wiki\/Lucene\" rel=\"glossary\" target=\"_blank\" title=\"Wikipedia, definition of: Lucene\" style=\"\" >Lucene<\/a><sup style=\"font-family: Georgia, Times New Roman, Serif; font-weight: bold; color: #AAAAAA\" ><em>W<\/em><\/sup>&#8230;<\/li>\n<li>except that it is designed only for English text&#8230;\n<ul>\n<li>and my blog is in French&#8230; <img src=\"https:\/\/blogs.wittwer.fr\/whiler\/wp-includes\/images\/smilies\/skype\/\/tmi.gif\" alt=\"(tmi)\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul>\n<li>I then found a French <em><a href=\"https:\/\/www.phenixapp-project.net\/repositories\/browse\/phenixapp\/trunk\/PhenixApp\/libs\/Phenix\/Search\/Lucene\/Analysis\/Analyzer\/Standard?rev=67\" target=\"_blank\">lemmatizer<\/a><\/em> for the <a href=\"http:\/\/fr.wikipedia.org\/wiki\/PHP\" rel=\"glossary\" target=\"_blank\" title=\"Wikipedia, definition of: PHP\" style=\"\" >PHP<\/a><sup style=\"font-family: Georgia, Times New Roman, Serif; font-weight: bold; color: #AAAAAA\" ><em>W<\/em><\/sup> version of Lucene.\n<ul>\n<li>I modified the wpSearch plugin to lemmatize French using the <a href=\"https:\/\/www.phenixapp-project.net\/repositories\/browse\/phenixapp\/trunk\/PhenixApp\/libs\/Phenix\/Search\/Lucene\/Analysis\/TokenFilter\/FrenchStemmer\" target=\"_blank\">PhenixApp<\/a> files&#8230;\n<ul>\n<li>But my search tests did not return more relevant results&#8230; <img src=\"https:\/\/blogs.wittwer.fr\/whiler\/wp-includes\/images\/smilies\/skype\/\/sweat.gif\" alt=\"(:|\" 
class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>So for now I have kept WordPress&rsquo;s standard search engine&#8230; <img src=\"https:\/\/blogs.wittwer.fr\/whiler\/wp-includes\/images\/smilies\/skype\/\/worry.gif\" alt=\":s\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> but I have certainly not said my last word <img src=\"https:\/\/blogs.wittwer.fr\/whiler\/wp-includes\/images\/smilies\/skype\/\/cool.gif\" alt=\"8-)\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <\/p>\n","protected":false},"excerpt":{"rendered":"<p>WordPress includes a search field for displaying only the matching posts\u2026<\/p>\n<p>I have grown used to luxurious searching with Sinequa, which among other things lets you run searches in which words can be lemmatized&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[9],"tags":[99,106,43,100,155,107],"class_list":["post-607","post","type-post","status-publish","format-standard","hentry","category-php","tag-addons","tag-blog","tag-donnees","tag-extensions","tag-sql","tag-wordpress"],"_links":{"self":[{"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/posts\/607","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/comments?post=607"}],"version-history":[{"count":0,"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/posts\/607\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/media?parent=607"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/categories?post=607"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.wittwer.fr\/whiler\/wp-json\/wp\/v2\/tags?post=607"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}