Thursday, September 10, 2009

Excluding couchdb-lucene Documents (or trying to)


Tonight is one of those nights when I make only a small offering to the gods of my chain. I spent most of the evening at Kevin Smith's presentation to the DC Erlang group, which limits the time that I can devote to the chain.

I have moved on to the next Cucumber scenario of the recipe update feature. This scenario describes how recipes that have been updated are suppressed in couchdb-lucene search results.

The scenario currently reads:
    Scenario: Searching for a recipe with an update
      Given a "Buttermilk Pancake" recipe with "buttermilk" in it
      And a "Buttermilk Pancake" recipe on another day with "lowfat milk" in it
      When the "buttermilk" recipe is marked as update of the "lowfat milk" recipe
      And I search for "pancake"
      Then I should see the "buttermilk" recipe in the results
      And I should not see the "lowfat milk" recipe in the results
The first few steps are already defined, though I do need to update the given-recipe steps to ensure that they get indexed—the transform function requires that recipe documents are of type "Recipe" and have been published:
    Given /^a "([^\"]*)" recipe (.*)with "([^\"]*)" in it$/ do |title, on, ingredient|
      date = (on == "") ? Date.new(2009, 9, 5) : Date.new(2000, 9, 5)
      permalink = date.to_s + "-" + title.downcase.gsub(/\W/, '-')

      @permalink_identified_by ||= { }
      @permalink_identified_by[ingredient] = permalink

      recipe = {
        :title        => title,
        :type         => 'Recipe',
        :published    => true,
        :date         => date,
        :preparations => [{'ingredient' => {'name' => ingredient}}]
      }

      RestClient.put "#{@@db}/#{permalink}",
        recipe.to_json,
        :content_type => 'application/json'
    end
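To make the capture groups and the permalink concrete, here is the same pattern run by hand in plain Ruby, outside of Cucumber. This is only a sketch: the `RECIPE_STEP` constant and the `permalink_for` helper are hypothetical names that wrap the date and permalink lines from the step above.

```ruby
require 'date'

# Same pattern as the Given step above.
RECIPE_STEP = /^a "([^\"]*)" recipe (.*)with "([^\"]*)" in it$/

# Hypothetical helper wrapping the date and permalink logic from the step.
def permalink_for(step_text)
  title, on, _ingredient = RECIPE_STEP.match(step_text).captures
  # An empty middle capture means no "on another day" text was given.
  date = (on == "") ? Date.new(2009, 9, 5) : Date.new(2000, 9, 5)
  date.to_s + "-" + title.downcase.gsub(/\W/, '-')
end

puts permalink_for('a "Buttermilk Pancake" recipe with "buttermilk" in it')
# prints 2009-09-05-buttermilk-pancake
puts permalink_for('a "Buttermilk Pancake" recipe on another day with "lowfat milk" in it')
# prints 2000-09-05-buttermilk-pancake
```

The two recipes share a title, so the date prefix is the only thing keeping their permalinks (and CouchDB document IDs) distinct.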
I have found it wise to give couchdb-lucene a half second to index these documents so I add another given step to allow that to happen:
    Scenario: Searching for a recipe with an update
      Given a "Buttermilk Pancake" recipe with "buttermilk" in it
      And a "Buttermilk Pancake" recipe on another day with "lowfat milk" in it
      And a 0.5 second wait to allow the search index to be updated
      ...
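The wait step itself is not reproduced in this post. As a rough idea of what such a step needs to do, the following plain-Ruby sketch pulls the number of seconds out of the step text and sleeps for that long. The `WAIT_STEP` pattern and the `wait_for_index` name are assumptions for illustration, not the actual definition.

```ruby
# Hypothetical pattern; the real step definition is not shown in this post.
WAIT_STEP = /^a ([.\d]+) second wait to allow the search index to be updated$/

# Sleep for the number of seconds named in the step text and return that number.
def wait_for_index(step_text)
  match = WAIT_STEP.match(step_text)
  raise ArgumentError, "not a wait step: #{step_text}" unless match
  seconds = match[1].to_f
  sleep seconds
  seconds
end

wait_for_index('a 0.5 second wait to allow the search index to be updated')
```

A fixed sleep is crude, but it keeps the scenario simple: it only has to give the indexer a moment to catch up before the search runs.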
I end up rewriting the text of the then clauses so that I avoid clashing with predefined steps:
      Then I should see the recipe with "buttermilk" in the search results
      And I should not see the recipe with "lowfat milk" in the search results
I can then define those two steps as:
    Then /^I should see the recipe with "([^\"]*)" in the search results$/ do |ingredient|
      response.should have_selector(".ingredients",
                                    :content => ingredient)
    end

    Then /^I should not see the recipe with "([^\"]*)" in the search results$/ do |ingredient|
      response.should_not have_selector(".ingredients",
                                        :content => ingredient)
    end
When I run that scenario, it fails because I have yet to exclude the updated recipe from the index:
cstrom@jaynestown:~/repos/eee-code$ cucumber features/recipe_replacement.feature:26
Sinatra::Test is deprecated; use Rack::Test instead.
Feature: Updating recipes in our cookbook

As an author
I want to mark recipes as replacing old one
So that I can record improvements and retain previous attempts for reference

Scenario: Searching for a recipe with an update # features/recipe_replacement.feature:26
Given a "Buttermilk Pancake" recipe with "buttermilk" in it # features/step_definitions/recipe_replacement.rb:1
And a "Buttermilk Pancake" recipe on another day with "lowfat milk" in it # features/step_definitions/recipe_replacement.rb:1
And a 0.5 second wait to allow the search index to be updated # features/step_definitions/recipe_search.rb:212
When the "buttermilk" recipe is marked as update of the "lowfat milk" recipe # features/step_definitions/recipe_replacement.rb:21
And I search for "pancake" # features/step_definitions/recipe_search.rb:216
Then I should see the recipe with "buttermilk" in the search results # features/step_definitions/recipe_replacement.rb:58
And I should not see the recipe with "lowfat milk" in the search results # features/step_definitions/recipe_replacement.rb:63
expected following output to omit a <.ingredients>lowfat milk</.ingredients>:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>EEE Cooks</title>
<link href="/stylesheets/style.css" rel="stylesheet" type="text/css">
</head>
<html><body>
..
<tr class="row0">
<td>
<a href="/recipes/2009-09-05-buttermilk-pancake">Buttermilk Pancake</a>
</td>
<td>
<span class="date">2009-09-05</span>
</td>
<td class="numeric">
<span class="prep">0</span>
</td>
<td>
<span class="ingredients">buttermilk</span>
</td>
</tr>
<tr class="row1">
<td>
<a href="/recipes/2000-09-05-buttermilk-pancake">Buttermilk Pancake</a>
</td>
<td>
<span class="date">2000-09-05</span>
</td>
<td class="numeric">
<span class="prep">0</span>
</td>
<td>
<span class="ingredients">lowfat milk</span>
</td>
</tr>
</table>
<div class="pagination">
<span class="inactive">« Previous</span><span class="current">1</span><span class="inactive">Next »</span>
</div>
<div id="footer"></div>
</body></html>
</html>
(Spec::Expectations::ExpectationNotMetError)
features/recipe_replacement.feature:33:in `And I should not see the recipe with "lowfat milk" in the search results'

Failing Scenarios:
cucumber features/recipe_replacement.feature:26 # Scenario: Searching for a recipe with an update

1 scenario (1 failed)
7 steps (1 failed, 6 passed)
0m1.155s
Unfortunately, I am stuck at this point. The Lucene Document object, which couchdb-lucene exposes as a Rhino "scriptable" object, has no concept of the IndexReader, which could be used to modify existing documents. Thus, I have no way to update documents that are already in the index in order to remove old recipes from it.

I have no immediate answer for this, so I mark the example as pending for now:
cstrom@jaynestown:~/repos/eee-code$ cucumber features/recipe_replacement.feature:26
Sinatra::Test is deprecated; use Rack::Test instead.
Feature: Updating recipes in our cookbook

As an author
I want to mark recipes as replacing old one
So that I can record improvements and retain previous attempts for reference

Scenario: Searching for a recipe with an update # features/recipe_replacement.feature:26
Given a "Buttermilk Pancake" recipe with "buttermilk" in it # features/step_definitions/recipe_replacement.rb:1
And a "Buttermilk Pancake" recipe on another day with "lowfat milk" in it # features/step_definitions/recipe_replacement.rb:1
And a 0.5 second wait to allow the search index to be updated # features/step_definitions/recipe_search.rb:212
When the "buttermilk" recipe is marked as update of the "lowfat milk" recipe # features/step_definitions/recipe_replacement.rb:21
And I search for "pancake" # features/step_definitions/recipe_search.rb:216
Then I should see the recipe with "buttermilk" in the search results # features/step_definitions/recipe_replacement.rb:58
And I should not see the recipe with "lowfat milk" in the search results # features/step_definitions/recipe_replacement.rb:63
figure out how to exclude documents, based on other documents (Cucumber::Pending)
features/recipe_replacement.feature:33:in `And I should not see the recipe with "lowfat milk" in the search results'

1 scenario (1 pending)
7 steps (1 pending, 6 passed)
0m1.817s
This may just be exhaustion on my part. I will likely play with this again tomorrow before moving on to another feature.
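One direction that sidesteps the index entirely would be to filter in the application after the search returns. The sketch below is only that: the `updated_by` field, the row layout, and the `without_replaced` helper are all hypothetical and do not reflect how the recipe documents or couchdb-lucene results are actually structured.

```ruby
# Hypothetical search rows; a real couchdb-lucene response has a different shape.
rows = [
  { 'id' => '2009-09-05-buttermilk-pancake', 'ingredients' => 'buttermilk' },
  { 'id' => '2000-09-05-buttermilk-pancake', 'ingredients' => 'lowfat milk',
    'updated_by' => '2009-09-05-buttermilk-pancake' }
]

# Drop every row that names a replacement in the (assumed) updated_by field.
def without_replaced(rows)
  rows.reject { |row| row['updated_by'] }
end

without_replaced(rows).each { |row| puts row['id'] }
# prints 2009-09-05-buttermilk-pancake
```

The likely cost is that result counts and pagination would no longer match what the search engine reports, so this would be a stopgap at best.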
