Strings: Working with Text

Duration: 45 min

Introduction

Strings are everywhere in programming: usernames, error messages, file paths, API responses, and application data. Java treats strings specially—they're immutable objects with a rich API for manipulation. Understanding strings deeply improves code quality and prevents common bugs.

This module covers String methods, concatenation, immutability, StringBuilder for efficiency, text formatting, basic regex patterns, and string comparisons. You'll learn when immutability matters, how to handle performance issues, and patterns for real-world text processing.

String Basics and Immutability

Strings are immutable—once created, they cannot change. Operations like substring or replace return new String objects, leaving the original untouched.

public class StringBasics {
    public static void main(String[] args) {
        // Creating strings
        String name = "Alice";
        String repeated = new String("Alice");  // Unnecessary - almost never do this
        
        // Immutability demonstration
        String original = "Hello";
        String modified = original.toUpperCase();
        System.out.println(original);  // Still "Hello"
        System.out.println(modified);  // "HELLO"
        
        // String literals are interned (cached for efficiency)
        String s1 = "Hello";
        String s2 = "Hello";
        System.out.println(s1 == s2);  // true - same reference (don't rely on this!)
        
        // String created with new is separate
        String s3 = new String("Hello");
        System.out.println(s1 == s3);  // false - different objects
        System.out.println(s1.equals(s3));  // true - same content (use this!)
        
        // Concatenation creates new String
        String greeting = "Hello";
        greeting = greeting + " " + "World";  // Creates new String object
        System.out.println(greeting);
        
        // Character access
        String word = "Java";
        System.out.println(word.length());  // 4
        System.out.println(word.charAt(0));  // 'J'
        System.out.println(word.charAt(3));  // 'a'
    }
}

Always use .equals() for string comparison, never ==. The == operator checks object reference, while .equals() compares content. This distinction is critical in production code.
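
One way to make content comparison safe when a value might be null is to call equals() on the literal, or to use Objects.equals(). The sketch below is only illustrative: the class name SafeEquals and the helper isAdmin are hypothetical, not part of this module's code.

```java
import java.util.Objects;

public class SafeEquals {
    // Hypothetical helper: true only when role has the content "admin".
    static boolean isAdmin(String role) {
        // Calling equals() on the literal avoids a NullPointerException when role is null.
        return "admin".equals(role);
    }

    public static void main(String[] args) {
        String fromInput = new String("admin");  // Stands in for text read at runtime

        System.out.println(fromInput == "admin");        // false - different objects
        System.out.println(isAdmin(fromInput));          // true - same content
        System.out.println(isAdmin(null));               // false - no exception thrown
        System.out.println(Objects.equals(null, null));  // true - null-safe on both sides
    }
}
```

Putting the literal first is a common habit because the literal can never be null.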

Essential String Methods

The String class has over 50 methods. Here are the most important:

public class StringMethods {
    public static void main(String[] args) {
        String text = "Hello World Java";
        
        // Length and character access
        System.out.println("Length: " + text.length());
        System.out.println("First char: " + text.charAt(0));
        
        // Substrings
        System.out.println(text.substring(0, 5));  // "Hello" (index 0-4)
        System.out.println(text.substring(6));  // "World Java" (from index 6)
        
        // Case conversion
        System.out.println(text.toUpperCase());  // "HELLO WORLD JAVA"
        System.out.println(text.toLowerCase());  // "hello world java"
        
        // Finding substrings
        System.out.println(text.indexOf("World"));  // 6
        System.out.println(text.indexOf("x"));  // -1 (not found)
        System.out.println(text.contains("Java"));  // true
        System.out.println(text.startsWith("Hello"));  // true
        System.out.println(text.endsWith("Java"));  // true
        
        // Trimming whitespace
        String dirty = "  spaces  ";
        System.out.println("|" + dirty.trim() + "|");  // "|spaces|"
        
        // Replacing
        System.out.println(text.replace("World", "Universe"));
        
        // Splitting into array
        String csv = "apple,banana,cherry";
        String[] fruits = csv.split(",");
        for (String fruit : fruits) {
            System.out.println(fruit);
        }
        
        // Joining array into string
        String[] words = {"The", "quick", "brown", "fox"};
        String joined = String.join(" ", words);
        System.out.println(joined);
    }
}

indexOf returns -1 when the substring is not found. In substring(begin, end), begin is inclusive and end is exclusive. split() interprets its argument as a regex, so escape metacharacters such as . or | when you mean them literally.
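
Because split() treats its argument as a regex, an unescaped metacharacter can give surprising results. The following sketch shows the difference; the class SplitEscaping and the helper splitLiteral are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitEscaping {
    // Hypothetical helper: splits on a literal delimiter by quoting it,
    // so regex metacharacters in the delimiter lose their special meaning.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String version = "1.8.0";

        // An unescaped "." matches every character, so nothing is left over.
        System.out.println(version.split(".").length);               // 0

        // Escaping the dot splits on the literal character.
        System.out.println(Arrays.toString(version.split("\\.")));   // [1, 8, 0]

        // Pattern.quote() does the escaping for any delimiter.
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|")));  // [a, b, c]
    }
}
```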

String Comparison

Comparing strings requires understanding value vs reference equality.

public class StringComparison {
    public static void main(String[] args) {
        String a = "Java";
        String b = "Java";
        String c = new String("Java");
        
        // equals() - content comparison (what you usually want)
        System.out.println(a.equals(b));  // true
        System.out.println(a.equals(c));  // true
        
        // equalsIgnoreCase() - case-insensitive
        System.out.println(a.equalsIgnoreCase("java"));  // true
        
        // == - reference comparison (don't use for strings)
        System.out.println(a == b);  // true (accidental - both are literals)
        System.out.println(a == c);  // false (different objects)
        
        // compareTo() - lexicographic ordering
        System.out.println("apple".compareTo("banana"));  // -1 (less)
        System.out.println("zebra".compareTo("apple"));  // 25 (greater: 'z' - 'a')
        System.out.println("java".compareTo("java"));  // 0 (equal)
        
        // isEmpty() and isBlank()
        String empty = "";
        String blank = "   ";
        System.out.println(empty.isEmpty());  // true
        System.out.println(blank.isEmpty());  // false
        System.out.println(blank.isBlank());  // true (Java 11+)
    }
}

For sorting or ordering, use compareTo(). For equality checks, use equals() or equalsIgnoreCase(). Never use == for content comparison.
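
As a rough illustration of ordering, the sketch below sorts an array with the natural compareTo() order and with String.CASE_INSENSITIVE_ORDER. The class and method names are hypothetical.

```java
import java.util.Arrays;

public class StringSorting {
    // Sorts a copy using natural (compareTo) order: uppercase letters come before lowercase.
    static String[] sortedNatural(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sorts a copy while ignoring case differences.
    static String[] sortedIgnoringCase(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        String[] names = {"bob", "Alice", "carol", "Dave"};
        System.out.println(Arrays.toString(sortedNatural(names)));       // [Alice, Dave, bob, carol]
        System.out.println(Arrays.toString(sortedIgnoringCase(names)));  // [Alice, bob, carol, Dave]
    }
}
```

The natural order compares character codes, which is why "Dave" sorts before "bob".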

StringBuilder for Efficiency

StringBuilder is mutable and efficient for repeated concatenations. Use it instead of + in loops.

public class StringBuilderDemo {
    public static void main(String[] args) {
        // Inefficient: concatenation in loop
        long start = System.currentTimeMillis();
        String result = "";
        for (int i = 0; i < 1000; i++) {
            result += i + ", ";  // Creates new String each iteration!
        }
        long end = System.currentTimeMillis();
        System.out.println("Concatenation time: " + (end - start) + "ms");
        
        // Efficient: StringBuilder
        start = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append(i).append(", ");  // Mutates single object
        }
        String result2 = sb.toString();
        end = System.currentTimeMillis();
        System.out.println("StringBuilder time: " + (end - start) + "ms");
        
        // StringBuilder methods
        StringBuilder text = new StringBuilder("Hello");
        text.append(" ");
        text.append("World");
        System.out.println(text);  // "Hello World"
        
        text.insert(6, "Beautiful ");  // Insert at index 6, just after the space
        System.out.println(text);  // "Hello Beautiful World"
        
        text.replace(0, 5, "Hi");  // Replace indices 0-4
        System.out.println(text);  // "Hi Beautiful World"
        
        text.reverse();
        System.out.println(text);  // Reversed
        
        System.out.println(text.length());  // Number of characters
        System.out.println(text.capacity());  // Internal buffer size
    }
}

StringBuilder appends into one internal buffer instead of allocating a new String at every step. For a single expression, String concatenation with + is fine. For loops or multi-step building, prefer StringBuilder.
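
A common variation when building lists in a loop is to avoid a trailing separator. The sketch below shows one way to do that; NumberList and joinRange are hypothetical names.

```java
public class NumberList {
    // Builds "0, 1, 2, ..." for the given count, without a trailing separator.
    static String joinRange(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                sb.append(", ");  // Separator goes before every element except the first
            }
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinRange(5));  // "0, 1, 2, 3, 4"
    }
}
```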

String Formatting

Format strings for readable output without manual concatenation.

public class StringFormatting {
    public static void main(String[] args) {
        // printf-style formatting
        System.out.printf("Name: %s, Age: %d%n", "Alice", 25);
        
        String formatted = String.format("Price: $%.2f", 19.5);
        System.out.println(formatted);  // "Price: $19.50"
        
        // Common format specifiers
        System.out.printf("%s%n", "String");  // %s = string
        System.out.printf("%d%n", 42);  // %d = integer
        System.out.printf("%f%n", 3.14);  // %f = float (default 6 decimals)
        System.out.printf("%.2f%n", 3.14159);  // %.2f = 2 decimal places
        System.out.printf("%x%n", 255);  // %x = hexadecimal
        System.out.printf("%05d%n", 42);  // %05d = zero-padded to 5 digits
        System.out.printf("%,d%n", 1000000);  // %,d = thousands separator
        
        // Table formatting
        System.out.printf("%-10s %5s %8s%n", "Item", "Qty", "Price");  // Header labels are strings, so use %s
        System.out.printf("%-10s %5d %8.2f%n", "Apple", 5, 2.50);
        System.out.printf("%-10s %5d %8.2f%n", "Banana", 12, 0.75);
        
        // String.join() for lists
        String csv = String.join(",", "Alice", "Bob", "Charlie");
        System.out.println(csv);  // "Alice,Bob,Charlie"
    }
}

%n emits the platform's line separator, so output is correct on every OS. Use String.format() when you need the result as a String, and System.out.printf() to print directly.
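
To illustrate a reusable format string, the sketch below keeps the template in a constant and returns the formatted row as a String. The names ReceiptLine, ROW, and row are hypothetical, and passing Locale.ROOT is an assumption made here so the decimal separator is always a period.

```java
import java.util.Locale;

public class ReceiptLine {
    // One template, reused for every row: left-aligned name, right-aligned quantity and price.
    private static final String ROW = "%-10s %5d %8.2f";

    static String row(String item, int qty, double price) {
        // Locale.ROOT keeps the decimal separator a period regardless of system settings.
        return String.format(Locale.ROOT, ROW, item, qty, price);
    }

    public static void main(String[] args) {
        // Each row is 25 characters wide: 10 + 1 + 5 + 1 + 8.
        System.out.println(row("Apple", 5, 2.5));
        System.out.println(row("Banana", 12, 0.75));
    }
}
```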

Regular Expressions Basics

Regex patterns for matching and manipulating text.

public class RegexBasics {
    public static void main(String[] args) {
        String email = "user@example.com";
        
        // Simple pattern matching
        System.out.println(email.matches(".*@.*\\.com"));  // true
        
        // Replace using regex
        String text = "The year 2024 has 365 days";
        String noNumbers = text.replaceAll("\\d", "X");  // \d = digit
        System.out.println(noNumbers);  // "The year XXXX has XXX days"
        
        // Split by pattern
        String csv = "apple, banana , cherry";
        String[] fruits = csv.split("\\s*,\\s*");  // Split by comma with optional spaces
        for (String fruit : fruits) {
            System.out.println("|" + fruit + "|");  // Clean output
        }
        
        // Common patterns
        String phone = "123-456-7890";
        System.out.println(phone.matches("\\d{3}-\\d{3}-\\d{4}"));  // true
        
        // Email pattern (simplified)
        String user = "john.doe@company.co.uk";
        System.out.println(user.matches("[a-zA-Z0-9.]+@[a-zA-Z0-9.]+"));  // true
        
        // Finding patterns
        String data = "ID:001, ID:002, ID:003";
        String withoutIds = data.replaceAll("ID:\\d+", "[REDACTED]");
        System.out.println(withoutIds);  // "[REDACTED], [REDACTED], [REDACTED]"
    }
}

Common regex patterns: \d (digit), \w (word char), \s (whitespace), . (any char), * (0+), + (1+), ? (0-1), {n} (exactly n), [abc] (a, b, or c). In a Java string literal, write each backslash twice, for example "\\d".
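
One detail that is easy to miss: String.matches() succeeds only when the pattern covers the entire string. The sketch below contrasts it with Matcher.find(), which searches anywhere in the text. WholeMatch and containsDigit are hypothetical names.

```java
import java.util.regex.Pattern;

public class WholeMatch {
    // find() looks for the pattern anywhere in the text.
    static boolean containsDigit(String text) {
        return Pattern.compile("\\d").matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "abc123";

        System.out.println(text.matches("\\d+"));    // false - the letters are not covered
        System.out.println(text.matches(".*\\d+"));  // true - .* absorbs the leading letters
        System.out.println(containsDigit(text));     // true - a digit appears somewhere
    }
}
```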

Common String Patterns

Pattern 1: Validating Input

String input = "  john  ";
if (!input.trim().isEmpty() && input.trim().length() >= 3) {
    System.out.println("Valid name");
}
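
The check above can be wrapped in a small helper that trims once and also tolerates null. This is a sketch under those assumptions; NameValidator and isValidName are hypothetical names.

```java
public class NameValidator {
    // Returns true when the input, after trimming, has at least 3 characters.
    static boolean isValidName(String input) {
        if (input == null) {
            return false;  // Treat missing input as invalid instead of throwing
        }
        String trimmed = input.trim();  // Trim once and reuse the result
        return trimmed.length() >= 3;   // Length 3+ already implies non-empty
    }

    public static void main(String[] args) {
        System.out.println(isValidName("  john  "));  // true
        System.out.println(isValidName(" ab "));      // false
        System.out.println(isValidName(null));        // false
    }
}
```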

Pattern 2: Parsing CSV-like Data

String line = "Alice,30,Engineer";
String[] parts = line.split(",");
String name = parts[0];
int age = Integer.parseInt(parts[1]);
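
Real input is often malformed, so a parser usually needs to guard against missing fields and non-numeric values. The sketch below is one possible approach; PersonParser, parseAge, and the -1 sentinel are assumptions made for illustration.

```java
public class PersonParser {
    // Hypothetical helper: returns the age from a "name,age,job" line,
    // or -1 when the line has too few fields or the age is not a number.
    static int parseAge(String line) {
        String[] parts = line.split(",");
        if (parts.length < 3) {
            return -1;
        }
        try {
            // trim() tolerates spaces around the value, e.g. "Bob, 41 ,Analyst"
            return Integer.parseInt(parts[1].trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAge("Alice,30,Engineer"));        // 30
        System.out.println(parseAge("Bob, 41 ,Analyst"));         // 41
        System.out.println(parseAge("Carol,unknown,Designer"));   // -1
        System.out.println(parseAge("broken"));                   // -1
    }
}
```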

Pattern 3: Building Query Strings

StringBuilder query = new StringBuilder("SELECT * FROM users WHERE ");
query.append("age > ").append(18);
query.append(" AND status = 'active'");

Advanced String Techniques

Working with Character Arrays

public class CharacterProcessing {
    public static void analyzeString(String text) {
        char[] chars = text.toCharArray();
        
        int vowels = 0;
        int consonants = 0;
        int digits = 0;
        
        for (char c : chars) {
            if ("aeiouAEIOU".indexOf(c) >= 0) {
                vowels++;
            } else if (Character.isLetter(c)) {
                consonants++;
            } else if (Character.isDigit(c)) {
                digits++;
            }
        }
        
        System.out.printf("Vowels: %d, Consonants: %d, Digits: %d%n", 
                         vowels, consonants, digits);
    }
    
    public static String reverseString(String text) {
        char[] chars = text.toCharArray();
        int left = 0, right = chars.length - 1;
        
        while (left < right) {
            char temp = chars[left];
            chars[left] = chars[right];
            chars[right] = temp;
            left++;
            right--;
        }
        
        return new String(chars);
    }
}
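
Swapping char values works for plain text, but a character outside the Basic Multilingual Plane is stored as two char values (a surrogate pair), and a char-by-char swap reverses that pair as well. The sketch below compares the swap approach with StringBuilder.reverse(), which keeps surrogate pairs together. ReverseComparison and its methods are hypothetical names; reverseChars repeats the swap loop so the example is self-contained.

```java
public class ReverseComparison {
    // Swap-based reversal over a char array, one char at a time.
    static String reverseChars(String text) {
        char[] chars = text.toCharArray();
        for (int left = 0, right = chars.length - 1; left < right; left++, right--) {
            char temp = chars[left];
            chars[left] = chars[right];
            chars[right] = temp;
        }
        return new String(chars);
    }

    // StringBuilder.reverse() treats a surrogate pair as a single character.
    static String reverseWithBuilder(String text) {
        return new StringBuilder(text).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseChars("Java"));        // "avaJ"
        System.out.println(reverseWithBuilder("Java"));  // "avaJ"

        // "\uD83D\uDE00" is one emoji stored as two char values.
        String withEmoji = "a\uD83D\uDE00b";
        System.out.println(reverseChars(withEmoji).equals(reverseWithBuilder(withEmoji)));  // false
    }
}
```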

Advanced String Methods

public class StringAdvanced {
    public static void main(String[] args) {
        String text = "Hello, World! Java is awesome.";
        
        // Case-insensitive comparison
        System.out.println("hello".equalsIgnoreCase("HELLO"));  // true
        
        // Comparing substrings
        System.out.println(text.regionMatches(0, "Hello", 0, 5));  // true
        
        // Finding patterns
        System.out.println(text.matches(".*Java.*"));  // true
        
        // Replacing with regex
        String processed = text.replaceAll("[^a-zA-Z ]", "");
        System.out.println(processed);  // "Hello World Java is awesome"
        
        // Character classification
        System.out.println(Character.isUpperCase('H'));  // true
        System.out.println(Character.isDigit('5'));  // true
        System.out.println(Character.isWhitespace(' '));  // true
        
        // String methods
        System.out.println(text.lastIndexOf("a"));  // Last occurrence
        System.out.println(text.replaceFirst("\\s", "_"));  // Replace first space
        System.out.println(text.concat(" More text"));
    }
}

String Immutability Deep Dive

public class StringImmutability {
    public static void demonstrateImmutability() {
        String original = "Hello";
        
        // These operations don't modify original
        String upper = original.toUpperCase();  // "HELLO"
        String sub = original.substring(0, 3);  // "Hel"
        String concat = original + " World";  // "Hello World"
        
        System.out.println(original);  // Still "Hello"
        
        // This creates many intermediate String objects
        String result = "A";
        for (int i = 0; i < 1000; i++) {
            result = result + "B";  // Creates new String each time!
        }
        // Very inefficient - creates 1000 intermediate String objects
        
        // Much better with StringBuilder
        StringBuilder sb = new StringBuilder("A");
        for (int i = 0; i < 1000; i++) {
            sb.append("B");  // Modifies same object
        }
        String resultBuilt = sb.toString();  // One final String
    }
}

Regular Expressions in Depth

import java.util.regex.*;

public class RegexAdvanced {
    public static void main(String[] args) {
        // Email validation (simplified)
        String email = "user@example.com";
        String emailPattern = "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}";
        System.out.println(email.matches(emailPattern));  // true

        // Phone number
        String phone = "555-123-4567";
        String phonePattern = "\\d{3}-\\d{3}-\\d{4}";
        System.out.println(phone.matches(phonePattern));  // true

        // Word extraction
        String text = "Hello 123 World 456 Java";
        Pattern pattern = Pattern.compile("\\w+");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            System.out.println(matcher.group());  // Each word
        }

        // Finding all digits
        Pattern digits = Pattern.compile("\\d+");
        Matcher digitMatcher = digits.matcher(text);
        while (digitMatcher.find()) {
            System.out.println(digitMatcher.group());  // "123", "456"
        }
    }
}
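
Extraction often means pulling out one part of a match. Parentheses in a pattern create capturing groups that Matcher.group(n) can return. The sketch below applies this to the phone format used above; PhoneParts, PHONE, and areaCode are hypothetical names, and returning null for a non-match is a choice made only for brevity.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhoneParts {
    // Three capturing groups: area code, prefix, line number.
    private static final Pattern PHONE = Pattern.compile("(\\d{3})-(\\d{3})-(\\d{4})");

    // Returns the first group (area code), or null when the text is not in this format.
    static String areaCode(String phone) {
        Matcher m = PHONE.matcher(phone);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(areaCode("555-123-4567"));  // "555"
        System.out.println(areaCode("5551234567"));    // null - dashes are required
    }
}
```

Compiling the pattern once into a constant avoids recompiling it on every call.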

Performance Comparison: String Operations

public class StringPerformance {
    public static void main(String[] args) {
        // Test 1: Concatenation performance
        long start = System.nanoTime();
        String result = "";
        for (int i = 0; i < 100; i++) {
            result += i;  // Creates new String each iteration
        }
        long time1 = System.nanoTime() - start;
        System.out.println("Concatenation time: " + time1 + "ns");
        
        // Test 2: StringBuilder performance
        start = System.nanoTime();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            sb.append(i);
        }
        String resultSb = sb.toString();
        long time2 = System.nanoTime() - start;
        System.out.println("StringBuilder time: " + time2 + "ns");
        
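        // Rough single-run timing; JIT warm-up makes small benchmarks like this noisy.
        // Floating-point division avoids truncation and a divide-by-zero error if time2 is 0.
        System.out.printf("StringBuilder is %.1fx faster%n", (double) time1 / time2);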
    }
}

Common String Tasks

public class StringTasks {
    // Reverse words
    public static String reverseWords(String text) {
        String[] words = text.split(" ");
        StringBuilder result = new StringBuilder();
        for (int i = words.length - 1; i >= 0; i--) {
            result.append(words[i]);
            if (i > 0) result.append(" ");
        }
        return result.toString();
    }
    
    // Remove duplicates
    public static String removeDuplicateChars(String text) {
        StringBuilder result = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (result.indexOf(String.valueOf(c)) < 0) {
                result.append(c);
            }
        }
        return result.toString();
    }
    
    // Count word frequency
    public static java.util.Map<String, Integer> wordFrequency(String text) {
        java.util.Map<String, Integer> freq = new java.util.HashMap<>();
        String[] words = text.toLowerCase().split("\\W+");
        for (String word : words) {
            if (!word.isEmpty()) {
                freq.put(word, freq.getOrDefault(word, 0) + 1);
            }
        }
        return freq;
    }
}

Key Takeaways

1. Strings are immutable: operations return new String objects
2. Always use .equals() for content comparison, never ==
3. Use StringBuilder for efficient string concatenation in loops
4. String formatting with % provides readable output
5. Basic regex patterns enable powerful text matching and manipulation
6. Character arrays offer fine-grained string processing
7. Methods like indexOf(), substring(), split() are workhorse methods
8. StringBuilder is mutable and should be used for complex building
9. Regular expressions support patterns for validation and extraction
10. Performance matters: concatenation vs StringBuilder can differ by orders of magnitude

Quiz

Question 1: What's the output?

String s1 = "Hello";
String s2 = s1;
s1 = s1 + " World";
System.out.println(s1);
System.out.println(s2);
  • A) "Hello World" and "Hello World"
  • B) "Hello World" and "Hello" ✓
  • C) "Hello" and "Hello World"
  • D) Compilation error

Question 2: Which comparison is correct?

String a = "Java";
String b = new String("Java");
  • A) a == b returns true
  • B) a.equals(b) returns true ✓
  • C) Both A and B are correct
  • D) Neither works in Java

Question 3: What's the most efficient way to build a string in a loop?

  • A) String concatenation with +
  • B) StringBuilder ✓
  • C) String.format()
  • D) Regex patterns

Question 4: What does this print?

String text = "Java Programming";
System.out.println(text.substring(5, 9));
  • A) "Java"
  • B) "Pro"
  • C) "Prog" ✓
  • D) "Progr"

Question 5: Which regex pattern matches three digits?

  • A) \d\d\d
  • B) \d{3}
  • C) \d+
  • D) Both A and B are correct ✓

Question 6: What's the output?

String s = "  Hello  ";
System.out.println(s.trim().length());
  • A) 9
  • B) 7
  • C) 5 ✓
  • D) 3