We’ll improve minigrep
by adding an extra feature: an option for
case-insensitive searching that the user can turn on via an environment
variable. We could make this feature a command line option and require that
users enter it each time they want it to apply, but instead we’ll use an
environment variable. Doing so allows our users to set the environment variable
once and have all their searches be case insensitive in that terminal session.
Writing a Failing Test for the Case-Insensitive search
Function
We want to add a new search_case_insensitive
function that we’ll call when
the environment variable is on. We’ll continue to follow the TDD process, so
the first step is again to write a failing test. We’ll add a new test for the
new search_case_insensitive
function and rename our old test from
one_result
to case_sensitive
to clarify the differences between the two
tests, as shown in Listing 12-20.
Filename: src/lib.rs
#![allow(unused)] fn main() { use std::error::Error; use std::fs; pub struct Config { pub query: String, pub filename: String, } impl Config { pub fn new(args: &[String]) -> Result
{ if args.len() < 3 { return Err("not enough arguments"); } let query = args[1].clone(); let filename = args[2].clone(); Ok(Config { query, filename }) } } pub fn run(config: Config) -> Result<(), Box> { let contents = fs::read_to_string(config.filename)?; for line in search(&config.query, &contents) { println!("{}", line); } Ok(()) } pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> { let mut results = Vec::new(); for line in contents.lines() { if line.contains(query) { results.push(line); } } results } #[cfg(test)] mod tests { use super::*; #[test] fn case_sensitive() { let query: &str = "duct"; let contents: &str = "\ Rust: safe, fast, productive. Pick three. Duct tape."; assert_eq!(vec!["safe, fast, productive."], search(query, contents)); } #[test] fn case_insensitive() { let query: &str = "rUsT"; let contents: &str = "\ Rust: safe, fast, productive. Pick three. Trust me."; assert_eq!( vec!["Rust:", "Trust me."], search_case_insensitive(query, contents) ); } } }
Note that we’ve edited the old test’s contents
too. We’ve added a new line
with the text "Duct tape."
using a capital D that shouldn’t match the query
"duct"
when we’re searching in a case-sensitive manner. Changing the old test
in this way helps ensure that we don’t accidentally break the case-sensitive
search functionality that we’ve already implemented. This test should pass now
and should continue to pass as we work on the case-insensitive search.
The new test for the case-insensitive search uses "rUsT"
as its query. In
the search_case_insensitive
function we’re about to add, the query "rUsT"
should match the line containing "Rust:"
with a capital R and match the line
"Trust me."
even though both have different casing from the query. This is
our failing test, and it will fail to compile because we haven’t yet defined
the search_case_insensitive
function. Feel free to add a skeleton
implementation that always returns an empty vector, similar to the way we did
for the search
function in Listing 12-16 to see the test compile and fail.
Implementing the search_case_insensitive
Function
The search_case_insensitive
function, shown in Listing 12-21, will be almost
the same as the search
function. The only difference is that we’ll lowercase
the query
and each line
so whatever the case of the input arguments,
they’ll be the same case when we check whether the line contains the query.
Filename: src/lib.rs
#![allow(unused)] fn main() { use std::error::Error; use std::fs; pub struct Config { pub query: String, pub filename: String, } impl Config { pub fn new(args: &[String]) -> Result
{ if args.len() < 3 { return Err("not enough arguments"); } let query = args[1].clone(); let filename = args[2].clone(); Ok(Config { query, filename }) } } pub fn run(config: Config) -> Result<(), Box> { let contents = fs::read_to_string(config.filename)?; for line in search(&config.query, &contents) { println!("{}", line); } Ok(()) } pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> { let mut results = Vec::new(); for line in contents.lines() { if line.contains(query) { results.push(line); } } results } pub fn search_case_insensitive<'a>( query: &str, contents: &'a str, ) -> Vec<&'a str> { let query: String = query.to_lowercase(); let mut results: Vec<&str> = Vec::new(); for line: &str in contents.lines() { if line.to_lowercase().contains(&query) { results.push(line); } } results } #[cfg(test)] mod tests { use super::*; #[test] fn case_sensitive() { let query = "duct"; let contents = "\ Rust: safe, fast, productive. Pick three. Duct tape."; assert_eq!(vec!["safe, fast, productive."], search(query, contents)); } #[test] fn case_insensitive() { let query = "rUsT"; let contents = "\ Rust: safe, fast, productive. Pick three. Trust me."; assert_eq!( vec!["Rust:", "Trust me."], search_case_insensitive(query, contents) ); } } }
First, we lowercase the query
string and store it in a shadowed variable with
the same name. Calling to_lowercase
on the query is necessary so no matter
whether the user’s query is "rust"
, "RUST"
, "Rust"
, or "rUsT"
, we’ll
treat the query as if it were "rust"
and be insensitive to the case. While
to_lowercase
will handle basic Unicode, it won't be 100% accurate. If we were
writing a real application, we'd want to do a bit more work here, but this section
is about environment variables, not Unicode, so we'll leave it at that here.
Note that query
is now a String
rather than a string slice, because calling
to_lowercase
creates new data rather than referencing existing data. Say the
query is "rUsT"
, as an example: that string slice doesn’t contain a lowercase
u
or t
for us to use, so we have to allocate a new String
containing
"rust"
. When we pass query
as an argument to the contains
method now, we
need to add an ampersand because the signature of contains
is defined to take
a string slice.
Next, we add a call to to_lowercase
on each line
before we check whether it
contains query
to lowercase all characters. Now that we’ve converted line
and query
to lowercase, we’ll find matches no matter what the case of the
query is.
Let’s see if this implementation passes the tests:
$ cargo test
Compiling minigrep v0.1.0 (file:///projects/minigrep)
Finished test [unoptimized + debuginfo] target(s) in 1.33s
Running unittests (target/debug/deps/minigrep-9cd200e5fac0fc94)
running 2 tests
test tests::case_insensitive ... ok
test tests::case_sensitive ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running unittests (target/debug/deps/minigrep-9cd200e5fac0fc94)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Doc-tests minigrep
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Great! They passed. Now, let’s call the new search_case_insensitive
function
from the run
function. First, we’ll add a configuration option to the
Config
struct to switch between case-sensitive and case-insensitive search.
Adding this field will cause compiler errors because we aren’t initializing
this field anywhere yet:
Filename: src/lib.rs
#![allow(unused)] fn main() { use std::error::Error; use std::fs; pub struct Config { pub query: String, pub filename: String, pub case_sensitive: bool, } impl Config { pub fn new(args: &[String]) -> Result
{ if args.len() < 3 { return Err("not enough arguments"); } let query = args[1].clone(); let filename = args[2].clone(); Ok(Config { query, filename }) } } pub fn run(config: Config) -> Result<(), Box> { let contents = fs::read_to_string(config.filename)?; let results = if config.case_sensitive { search(&config.query, &contents) } else { search_case_insensitive(&config.query, &contents) }; for line in results { println!("{}", line); } Ok(()) } pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> { let mut results = Vec::new(); for line in contents.lines() { if line.contains(query) { results.push(line); } } results } pub fn search_case_insensitive<'a>( query: &str, contents: &'a str, ) -> Vec<&'a str> { let query = query.to_lowercase(); let mut results = Vec::new(); for line in contents.lines() { if line.to_lowercase().contains(&query) { results.push(line); } } results } #[cfg(test)] mod tests { use super::*; #[test] fn case_sensitive() { let query = "duct"; let contents = "\ Rust: safe, fast, productive. Pick three. Duct tape."; assert_eq!(vec!["safe, fast, productive."], search(query, contents)); } #[test] fn case_insensitive() { let query = "rUsT"; let contents = "\ Rust: safe, fast, productive. Pick three. Trust me."; assert_eq!( vec!["Rust:", "Trust me."], search_case_insensitive(query, contents) ); } } }
Note that we added the case_sensitive
field that holds a Boolean. Next, we
need the run
function to check the case_sensitive
field’s value and use
that to decide whether to call the search
function or the
search_case_insensitive
function, as shown in Listing 12-22. Note this still
won’t compile yet.
Filename: src/lib.rs
#![allow(unused)] fn main() { use std::error::Error; use std::fs; pub struct Config { pub query: String, pub filename: String, pub case_sensitive: bool, } impl Config { pub fn new(args: &[String]) -> Result
{ if args.len() < 3 { return Err("not enough arguments"); } let query = args[1].clone(); let filename = args[2].clone(); Ok(Config { query, filename }) } } pub fn run(config: Config) -> Result<(), Box<dyn Error>> { let contents: String = fs::read_to_string(path: config.filename)?; let results: Vec<&str> = if config.case_sensitive { search(&config.query, &contents) } else { search_case_insensitive(&config.query, &contents) }; for line: &str in results { println!("{}", line); } Ok(()) } pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> { let mut results = Vec::new(); for line in contents.lines() { if line.contains(query) { results.push(line); } } results } pub fn search_case_insensitive<'a>( query: &str, contents: &'a str, ) -> Vec<&'a str> { let query = query.to_lowercase(); let mut results = Vec::new(); for line in contents.lines() { if line.to_lowercase().contains(&query) { results.push(line); } } results } #[cfg(test)] mod tests { use super::*; #[test] fn case_sensitive() { let query = "duct"; let contents = "\ Rust: safe, fast, productive. Pick three. Duct tape."; assert_eq!(vec!["safe, fast, productive."], search(query, contents)); } #[test] fn case_insensitive() { let query = "rUsT"; let contents = "\ Rust: safe, fast, productive. Pick three. Trust me."; assert_eq!( vec!["Rust:", "Trust me."], search_case_insensitive(query, contents) ); } } }
Finally, we need to check for the environment variable. The functions for
working with environment variables are in the env
module in the standard
library, so we want to bring that module into scope with a use std::env;
line
at the top of src/lib.rs. Then we’ll use the var
function from the env
module to check for an environment variable named CASE_INSENSITIVE
, as shown
in Listing 12-23.
Filename: src/lib.rs
#![allow(unused)] fn main() { use std::env; // --snip-- use std::error::Error; use std::fs; pub struct Config { pub query: String, pub filename: String, pub case_sensitive: bool, } impl Config { pub fn new(args: &[String]) -> Result<Config, &str> { if args.len() < 3 { return Err("not enough arguments"); } let query: String = args[1].clone(); let filename: String = args[2].clone(); let case_sensitive: bool = env::var(key: "CASE_INSENSITIVE").is_err(); Ok(Config { query, filename, case_sensitive, }) } } pub fn run(config: Config) -> Result<(), Box
> { let contents = fs::read_to_string(config.filename)?; let results = if config.case_sensitive { search(&config.query, &contents) } else { search_case_insensitive(&config.query, &contents) }; for line in results { println!("{}", line); } Ok(()) } pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> { let mut results = Vec::new(); for line in contents.lines() { if line.contains(query) { results.push(line); } } results } pub fn search_case_insensitive<'a>( query: &str, contents: &'a str, ) -> Vec<&'a str> { let query = query.to_lowercase(); let mut results = Vec::new(); for line in contents.lines() { if line.to_lowercase().contains(&query) { results.push(line); } } results } #[cfg(test)] mod tests { use super::*; #[test] fn case_sensitive() { let query = "duct"; let contents = "\ Rust: safe, fast, productive. Pick three. Duct tape."; assert_eq!(vec!["safe, fast, productive."], search(query, contents)); } #[test] fn case_insensitive() { let query = "rUsT"; let contents = "\ Rust: safe, fast, productive. Pick three. Trust me."; assert_eq!( vec!["Rust:", "Trust me."], search_case_insensitive(query, contents) ); } } }
Here, we create a new variable case_sensitive
. To set its value, we call the
env::var
function and pass it the name of the CASE_INSENSITIVE
environment
variable. The env::var
function returns a Result
that will be the successful
Ok
variant that contains the value of the environment variable if the
environment variable is set. It will return the Err
variant if the
environment variable is not set.
We’re using the is_err
method on the Result
to check whether it’s an error
and therefore unset, which means it should do a case-sensitive search. If the
CASE_INSENSITIVE
environment variable is set to anything, is_err
will
return false and the program will perform a case-insensitive search. We don’t
care about the value of the environment variable, just whether it’s set or
unset, so we’re checking is_err
rather than using unwrap
, expect
, or any
of the other methods we’ve seen on Result
.
We pass the value in the case_sensitive
variable to the Config
instance so
the run
function can read that value and decide whether to call search
or
search_case_insensitive
, as we implemented in Listing 12-22.
Let’s give it a try! First, we’ll run our program without the environment
variable set and with the query to
, which should match any line that contains
the word “to” in all lowercase:
$ cargo run to poem.txt
Compiling minigrep v0.1.0 (file:///projects/minigrep)
Finished dev [unoptimized + debuginfo] target(s) in 0.0s
Running `target/debug/minigrep to poem.txt`
Are you nobody, too?
How dreary to be somebody!
Looks like that still works! Now, let’s run the program with CASE_INSENSITIVE
set to 1
but with the same query to
.
If you're using PowerShell, you will need to set the environment variable and run the program as separate commands:
PS> $Env:CASE_INSENSITIVE=1; cargo run to poem.txt
This will make CASE_INSENSITIVE
persist for the remainder of your shell
session. It can be unset with the Remove-Item
cmdlet:
PS> Remove-Item Env:CASE_INSENSITIVE
We should get lines that contain “to” that might have uppercase letters:
$ CASE_INSENSITIVE=1 cargo run to poem.txt
Finished dev [unoptimized + debuginfo] target(s) in 0.0s
Running `target/debug/minigrep to poem.txt`
Are you nobody, too?
How dreary to be somebody!
To tell your name the livelong day
To an admiring bog!
Excellent, we also got lines containing “To”! Our minigrep
program can now do
case-insensitive searching controlled by an environment variable. Now you know
how to manage options set using either command line arguments or environment
variables.
Some programs allow arguments and environment variables for the same configuration. In those cases, the programs decide that one or the other takes precedence. For another exercise on your own, try controlling case insensitivity through either a command line argument or an environment variable. Decide whether the command line argument or the environment variable should take precedence if the program is run with one set to case sensitive and one set to case insensitive.
The std::env
module contains many more useful features for dealing with
environment variables: check out its documentation to see what is available.