Data Structures and Algorithms, commonly called DSA, is the foundation of computer science and software engineering. Every major product you use daily, from Google Search to Instagram's feed ranking to Uber's routing engine, runs on well-designed data structures and efficient algorithms underneath.
At its core, DSA answers two critical questions: how should data be organized in memory (data structures), and how should it be processed efficiently (algorithms)?
Without DSA knowledge, you can write code that works correctly on small inputs but completely collapses at scale. With DSA, you write programs that remain fast whether they handle 100 records or 100 million.
Before diving into individual structures, you need to understand the four pillars that everything else is built on:
1. Logic Building and Problem Decomposition
This is the ability to read a complex problem, break it into smaller solvable pieces, and connect those pieces into a full solution. It is a skill you build through practice, not memorization.
2. Time and Space Complexity Analysis
Every solution has a cost. Time complexity measures how the runtime grows as input size grows. Space complexity measures how much extra memory is consumed. We express both using Big-O notation; for example, O(n) means the work grows linearly with input size.
3. Data Organization and Access Patterns
Different problems need different storage strategies. Arrays give fast index-based access. Hash maps give fast key-based lookup. Trees give fast hierarchical search. Choosing wrong here makes everything else harder.
4. Selecting the Right Structure for the Right Problem
A hammer is not the right tool for a screw. Similarly, a linked list is not the right tool when you need random access by index. Understanding when to use what is the mark of a strong engineer.
Logic building is the first skill every programmer must develop. It is trained with small problems that teach your brain to think step by step, using conditions, loops, mathematical observations, and pattern recognition.
Before you can solve "find the shortest path in a graph," you need to be comfortable with "is this number even or odd?" That comfort comes from drilling logic problems daily.
When you face a new problem in an interview, you don't recall a solution from memory; you construct one. That construction ability is pure logic. Engineers who skip this step struggle to think on their feet during live interviews.
# Check if a number is even or odd using the modulus operator.
# The % operator gives the remainder after division.
# If n % 2 == 0, there is no remainder, so the number is divisible by 2: it's even.
# If n % 2 == 1, there is a remainder of 1: the number is odd.
n = 7
if n % 2 == 0:
    print("Even")
else:
    print("Odd")
# Output: Odd
When we write two different solutions to the same problem, how do we decide which one is better? We use complexity analysis.
Time complexity does not measure actual seconds; it measures how the number of operations scales with input size n. We drop constants and lower-order terms because they become irrelevant at large scale.
| Notation | Name | Example |
|---|---|---|
| O(1) | Constant | Array index access |
| O(log n) | Logarithmic | Binary search |
| O(n) | Linear | Single loop |
| O(n log n) | Linearithmic | Merge sort |
| O(n²) | Quadratic | Nested loops |
| O(2ⁿ) | Exponential | Recursive subsets |
Space complexity measures extra memory your algorithm uses beyond the input itself. An algorithm that creates a new array of size n has O(n) space complexity. One that only uses a few variables has O(1) space complexity.
# This single loop runs exactly n times.
# For each of the n elements, we do one print operation.
# Total operations = n → Time Complexity = O(n)
# We use no extra memory beyond the loop variable → Space Complexity = O(1)
n = 5  # example input size
for i in range(n):
    print(i)
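To make the space side of the analysis equally concrete, here is a small sketch (function names are illustrative, not from the article) contrasting O(1) and O(n) extra space for the same O(n)-time task:

```python
# Two ways to sum the squares of 0..n-1: same O(n) time, different space.

def sum_squares_constant_space(n):
    total = 0
    for i in range(n):  # one accumulator variable -> O(1) extra space
        total += i * i
    return total

def sum_squares_linear_space(n):
    squares = [i * i for i in range(n)]  # stores n values -> O(n) extra space
    return sum(squares)

print(sum_squares_constant_space(10))  # 285
print(sum_squares_linear_space(10))    # 285
```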
An array is the simplest and most widely used data structure. It stores a fixed-size collection of elements of the same type in contiguous (side-by-side) memory locations.
Because elements sit next to each other in memory, accessing any element by its index takes constant time, O(1). You tell the computer "give me the element at position 3," and it instantly calculates the memory address and returns the value.
When to use arrays:
- You need O(1) access by index.
- The number of elements is known in advance or changes rarely.
- You iterate sequentially and benefit from a cache-friendly memory layout.

Limitations:
- The size is fixed in low-level languages; growing means allocating and copying.
- Inserting or deleting in the middle is O(n), because elements must be shifted.

# Find the maximum value in an array manually (without using max()).
# We start by assuming the first element is the maximum.
# Then we walk through every element.
# If we find something larger, we update our maximum.
# After the loop, max_val holds the largest element.
arr = [3, 7, 2, 9, 5]
max_val = arr[0] # assume first element is max
for x in arr:
    if x > max_val:
        max_val = x  # update when we find something bigger
print(max_val) # Output: 9
A string is a sequence of characters. In most languages, strings are stored internally as arrays of characters, which means all array operations apply; but strings often come with additional built-in methods for searching, splitting, replacing, and transforming text.
Strings appear everywhere in real applications: parsing user input, searching logs, processing DNA sequences, building compilers. Mastering string manipulation is essential for both interviews and production work.
Key string operations to know:
- Indexing and slicing
- Reversal
- Concatenation
- Searching (find / index)
- Splitting and joining
- Case conversion
# Python strings support slicing with [start:stop:step].
# s[::-1] means: start from end, go to beginning, step -1 (backwards).
# This is the most Pythonic way to reverse a string: O(n) time, O(n) space.
s = "hello"
print(s[::-1]) # Output: "olleh"
# Palindrome check: a string that reads the same forwards and backwards
s2 = "racecar"
print(s2 == s2[::-1]) # Output: True
Hashing is one of the most powerful ideas in computer science. A hash function takes any key (a string, number, object) and converts it into an integer index that points to a location in an internal array called a hash table.
This gives us average-case O(1) insertion, deletion, and lookup, far faster than searching through an array (O(n)) or a sorted list (O(log n)).
How it works internally:
1. You write hashmap["name"] = "Alice".
2. The hash function computes hash("name") = 47 (some integer).
3. "Alice" is stored at index 47 in the internal array.
4. When you later read hashmap["name"], it recomputes the hash, goes to index 47, and returns "Alice" instantly.

Collision: Two different keys can produce the same hash index. This is handled by chaining (each slot holds a linked list) or open addressing. Worst case with many collisions is O(n), but good hash functions make this rare.
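As a sketch of the chaining idea described above (the class name and default size are illustrative, not a real library API):

```python
# Minimal hash map with chaining: each bucket holds a list of (key, value)
# pairs, so colliding keys simply share a bucket. Illustrative, not production.
class ChainedHashMap:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self.buckets)  # hash function -> bucket index

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # new key (or collision): append to chain

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

m = ChainedHashMap()
m.put("name", "Alice")
print(m.get("name"))  # Alice
```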
# Count how many times each number appears in the array.
# freq is a dictionary, Python's built-in hash map.
# freq.get(x, 0) returns the current count of x, or 0 if x was never seen.
# We then add 1 to that count.
arr = [1, 2, 2, 3, 1]
freq = {}
for x in arr:
    freq[x] = freq.get(x, 0) + 1
print(freq)
# Output: {1: 2, 2: 2, 3: 1}
# Meaning: 1 appears 2 times, 2 appears 2 times, 3 appears 1 time.
A linked list is a dynamic data structure where each element (called a node) stores two things: the actual data value, and a pointer/reference to the next node in the sequence.
Unlike arrays, linked list nodes are not stored in contiguous memory. They can be scattered anywhere in memory and are connected only through pointers. This makes insertion and deletion at any position very efficient (O(1) if you already have a reference to that node) because you just redirect pointers instead of shifting elements.
The tradeoff: you lose random access. To reach the 5th element, you must traverse from the head, one node at a time, which is O(n).
# Node is the building block of all linked list types.
# data holds the value.
# next holds a reference to the next node (None if this is the last node).
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None  # By default, a new node points to nothing
Each node points only forward to the next node. Traversal is one-directional, head to tail. Memory efficient, but you cannot go backwards.
Each node holds two pointers: one to the next node and one to the previous node. This allows traversal in both directions. Used in browsers (back/forward navigation), text editors (undo/redo), and LRU cache implementations.
The last node's next pointer points back to the head instead of None, forming a loop. Used in round-robin scheduling (operating systems), multiplayer game turn management, and music players with repeat mode.
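To tie the variants back to code, here is a minimal sketch that builds and traverses a small singly linked list (it redefines the Node class from above so it runs standalone):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

# Build 1 -> 2 -> 3 by linking nodes manually
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)

# Traverse from head to tail: O(n)
values = []
node = head
while node is not None:
    values.append(node.data)
    node = node.next
print(values)  # [1, 2, 3]
```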
A Stack follows the LIFO principle: Last In, First Out. Think of a stack of plates: you always add to the top and remove from the top. The plate you placed last is the first one you pick up.
Core operations:
- push(x): add element to the top, O(1)
- pop(): remove and return the top element, O(1)
- peek(): view the top element without removing it, O(1)

Real-world applications:
- Undo/redo in text editors
- Browser back-button history
- Function call management (the call stack)
- Balanced-parentheses and expression evaluation
# Python lists work perfectly as stacks.
# append() adds to the end (top of the stack): O(1)
# pop() removes from the end (top of the stack): O(1)
stack = []
stack.append(10) # Stack: [10]
stack.append(20) # Stack: [10, 20]
stack.append(30) # Stack: [10, 20, 30]
stack.pop() # Removes 30 (last in, first out)
print(stack) # Output: [10, 20]
A Queue follows the FIFO principle: First In, First Out. Think of a queue at a ticket counter: the person who joined first gets served first.
Core operations:
- enqueue(x): add to the rear, O(1)
- dequeue(): remove from the front, O(1)

Real-world applications:
- CPU task scheduling and process queues
- Printer job queues
- Breadth-first search (BFS) traversal
- Message queues between services
# Use collections.deque for queues in Python.
# list.pop(0) works but is O(n): it shifts all remaining elements.
# deque.popleft() is O(1), purpose-built for queue operations.
from collections import deque
queue = deque()
queue.append(10) # Queue: [10] (enqueue at rear)
queue.append(20) # Queue: [10, 20]
queue.append(30) # Queue: [10, 20, 30]
queue.popleft() # Removes 10 (first in, first out)
print(queue) # Output: deque([20, 30])
A tree is a hierarchical, non-linear data structure. Unlike arrays and linked lists which are linear (one element after another), trees branch out โ one node can connect to multiple nodes below it.
Key terminology:
- Root: the topmost node
- Parent/child: a node directly above/below another node
- Leaf: a node with no children
- Height: the number of edges on the longest path from the root to a leaf
- Subtree: a node together with all of its descendants
Trees are used everywhere: file systems (folders inside folders), HTML DOM, organization charts, compilers (abstract syntax trees), and database indexing.
# A basic tree node has a value and references to left and right children.
# left and right are None when a node has no children (leaf node).
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None   # left child
        self.right = None  # right child
A Binary Tree is a tree where each node has at most 2 children, conventionally called left and right. This constraint enables efficient algorithms for search, insertion, and traversal.
A BST is a binary tree with a strict ordering property:
- Every value in a node's left subtree is smaller than the node's value.
- Every value in its right subtree is larger.
- Both subtrees are themselves BSTs.
This ordering makes search extremely efficient. To find a value, start at the root: if the target is smaller, go left; if larger, go right. You eliminate half the remaining nodes at each step, giving O(log n) average search time.
Important caveat: If the BST becomes unbalanced (all nodes inserted in sorted order creates a straight line), performance degrades to O(n). This is why self-balancing trees like AVL trees and Red-Black trees exist.
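The ordering rule can be sketched as a minimal insert/search pair (an illustrative version, not self-balancing, so it shows exactly the degradation described above when fed sorted input):

```python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def insert(root, val):
    if root is None:
        return TreeNode(val)         # empty spot found: place the value here
    if val < root.val:
        root.left = insert(root.left, val)    # smaller values go left
    else:
        root.right = insert(root.right, val)  # larger (or equal) go right
    return root

def search(root, target):
    while root is not None:
        if target == root.val:
            return True
        # ordering property: pick exactly one side, discarding the other
        root = root.left if target < root.val else root.right
    return False

root = None
for v in [8, 3, 10, 1, 6]:
    root = insert(root, v)
print(search(root, 6))  # True
print(search(root, 7))  # False
```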
A Graph is the most versatile data structure: it models relationships between entities. A graph consists of:
- Vertices (nodes): the entities themselves
- Edges: the connections between pairs of vertices

Types of graphs:
- Directed vs. undirected (one-way vs. two-way edges)
- Weighted vs. unweighted (edges with or without costs)
- Cyclic vs. acyclic

Storage methods:
- Adjacency list: each vertex stores a list of its neighbors. Good for sparse graphs. O(V + E) space.
- Adjacency matrix: matrix[i][j] = 1 means an edge from i to j. Good for dense graphs. O(V²) space.

# Adjacency list representation using a dictionary.
# Each key is a node, and its value is a list of directly connected neighbors.
graph = {
    0: [1, 2],  # Node 0 connects to nodes 1 and 2
    1: [0, 3],  # Node 1 connects to nodes 0 and 3
    2: [0],     # Node 2 connects to node 0 only
    3: [1]      # Node 3 connects to node 1 only
}
A Trie (pronounced "try") is a specialized tree designed specifically for storing and searching strings. Instead of storing whole strings at nodes, a trie stores one character per node. Words are formed by following paths from the root to a terminal node marked is_end = True.
Why Tries are powerful:
- Searching for a word of length m takes O(m), independent of how many words are stored.
- Words with a common prefix share a single path, which saves space and enables fast prefix queries.

Real-world use:
- Autocomplete and search suggestions
- Spell checkers
- IP routing (longest-prefix matching)
# TrieNode stores a dictionary of children (one key per character)
# and a flag is_end marking whether a complete word ends here.
class TrieNode:
    def __init__(self):
        self.children = {}   # e.g., {'a': TrieNode, 'b': TrieNode}
        self.is_end = False  # True when a valid word ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            # If this character doesn't exist as a child, create it
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]  # move down to that child
        node.is_end = True  # mark end of this word

    def search(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                return False  # character path doesn't exist
            node = node.children[char]
        return node.is_end  # only True if word was inserted
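The search above checks full words only; autocomplete needs prefix lookup. A minimal self-contained sketch adding a starts_with method to the same node layout (the method name is an assumption, mirroring common trie APIs):

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def starts_with(self, prefix):
        # Walk the prefix path; if it exists, some stored word begins with it.
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True

t = Trie()
t.insert("apple")
print(t.starts_with("app"))  # True
print(t.starts_with("apx"))  # False
```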
Searching means finding the position of a target value within a data structure. The two fundamental approaches are:
Linear Search: Check every element one by one from the beginning. Works on any array (sorted or unsorted). Time: O(n). Simple and reliable but slow on large data.
Binary Search: Only works on a sorted array. Compare the target with the middle element. If target is smaller, search the left half; if larger, search the right half. Repeat until found. Each step halves the search space, giving O(log n) time.
Why Binary Search is so fast: An array of 1 billion sorted elements requires at most ~30 comparisons with binary search. Linear search would need up to 1 billion comparisons.
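The ~30 figure comes straight from the base-2 logarithm, since each comparison halves the remaining range; a quick check:

```python
import math

# Binary search halves the search space each step, so the worst-case
# number of comparisons is about log2(n), rounded up.
print(math.ceil(math.log2(10**9)))  # 30
```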
def linear_search(arr, key):
    # Walk through every element: O(n)
    for i in range(len(arr)):
        if arr[i] == key:
            return i  # found at index i
    return -1  # not found

def binary_search(arr, key):
    # REQUIREMENT: arr must be sorted before calling this.
    l, r = 0, len(arr) - 1
    while l <= r:
        mid = (l + r) // 2  # find the middle index
        if arr[mid] == key:
            return mid  # found!
        elif arr[mid] < key:
            l = mid + 1  # target is in the right half
        else:
            r = mid - 1  # target is in the left half
    return -1  # not found
Sorting arranges elements in a defined order (ascending or descending). Sorted data is much faster to search, merge, and process. The choice of sorting algorithm depends on data size, whether data is nearly sorted, and memory constraints.
Bubble Sort: Compare adjacent pairs and swap if out of order. Repeat until no swaps are needed. Very simple to understand but slow: O(n²) time. Only practical for small arrays or teaching purposes.
Merge Sort: Divide the array in half recursively until each part has 1 element, then merge them back in sorted order. O(n log n) time, O(n) space. Stable and reliable.
Quick Sort: Choose a pivot, partition elements around it (smaller left, larger right), then recursively sort both sides. Average O(n log n), worst case O(nยฒ) with bad pivot choices.
# Bubble Sort implementation: educational example
# Outer loop runs n times (n passes through the array)
# Inner loop compares adjacent pairs and bubbles the largest to the end
# After each pass, the last i elements are in their correct position
arr = [5, 3, 2, 4]
for i in range(len(arr)):
    for j in range(len(arr) - 1 - i):  # skip the last i elements, already in place
        if arr[j] > arr[j + 1]:
            # Swap adjacent elements if they are in the wrong order
            arr[j], arr[j + 1] = arr[j + 1], arr[j]
print(arr)  # Output: [2, 3, 4, 5]
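Merge sort, described above but not coded in the article, can be sketched as a top-down version that returns a new sorted list:

```python
def merge_sort(arr):
    if len(arr) <= 1:
        return arr  # base case: 0 or 1 elements are already sorted
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # recursively sort the left half
    right = merge_sort(arr[mid:])   # recursively sort the right half

    # Merge the two sorted halves back together
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # leftover elements are already sorted
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 3, 2, 4, 1]))  # [1, 2, 3, 4, 5]
```

Taking <= (not <) in the merge keeps equal elements in their original order, which is what makes merge sort stable.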
Recursion is a technique where a function calls itself to solve a smaller version of the same problem. Every recursive solution has two mandatory components:
- Base case: the smallest input, answered directly without further recursion.
- Recursive case: the function calls itself on a smaller input and combines the result.
Mental model: Think of recursion as trust. You trust that your function correctly solves the problem for n-1, and you use that result to solve for n.
Recursion is the foundation of divide-and-conquer algorithms, tree traversals, backtracking, and dynamic programming.
# Factorial: n! = n × (n-1) × (n-2) × ... × 1
# Base case: 0! = 1 (by mathematical definition)
# Recursive case: n! = n × (n-1)!
def factorial(n):
    if n == 0:
        return 1  # BASE CASE: stop here
    return n * factorial(n - 1)  # RECURSIVE CASE: trust the smaller result

# Execution trace for factorial(4):
# factorial(4) = 4 × factorial(3)
#              = 4 × 3 × factorial(2)
#              = 4 × 3 × 2 × factorial(1)
#              = 4 × 3 × 2 × 1 × factorial(0)
#              = 4 × 3 × 2 × 1 × 1 = 24
print(factorial(5)) # Output: 120
A Greedy algorithm makes the locally optimal choice at each step, hoping that these local choices lead to a globally optimal solution. It never reconsiders past choices; it just always picks the best option available right now.
When greedy works: The problem must have the greedy choice property, meaning that making the locally best choice at each step guarantees a globally optimal result. Not every problem has this property, so greedy must be proven correct, not just assumed.
Classic greedy problems:
- Activity/interval selection (pick the meeting that ends earliest)
- Fractional knapsack
- Huffman coding
- Coin change with canonical denominations
# Coin Change: greedy approach
# Given coin denominations [10, 5, 2, 1] and amount 27,
# find the minimum number of coins needed.
# Strategy: always use the largest coin that fits.
# This works here because denominations are "nice" (each divides the next).
# Note: greedy coin change FAILS for arbitrary denominations; use DP instead.
coins = [10, 5, 2, 1]
amount = 27
count = 0
for coin in coins:
    count += amount // coin  # use as many of this coin as possible
    amount %= coin           # remaining amount after using this coin
# Trace:
# coin=10: use 2 coins (20), remaining=7
# coin=5: use 1 coin (5), remaining=2
# coin=2: use 1 coin (2), remaining=0
# coin=1: use 0 coins, remaining=0
# Total: 4 coins
print(count) # Output: 4
Dynamic Programming solves problems by breaking them into overlapping subproblems, solving each subproblem once, and storing the result to avoid recomputation. This is the key difference from naive recursion: DP never solves the same subproblem twice.
Two hallmarks of a DP problem:
- Overlapping subproblems: the same smaller problems appear repeatedly.
- Optimal substructure: the optimal answer is built from optimal answers to subproblems.

Two DP approaches:
- Top-down (memoization): write the natural recursion and cache each result the first time it is computed.
- Bottom-up (tabulation): fill a table from the base cases upward, iteratively.
# Fibonacci sequence using Bottom-Up DP (Tabulation)
# F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2)
# Naive recursion computes F(n) in O(2^n): exponential (it recalculates everything)
# DP tabulation computes F(n) in O(n): each value computed exactly once
dp = [0, 1] # base cases: F(0)=0, F(1)=1
for i in range(2, 10):
    dp.append(dp[i - 1] + dp[i - 2])  # each value built from the two previous values
print(dp)
# Output: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
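The top-down alternative, memoization, keeps the natural recursive shape and caches each result; a minimal sketch using the standard functools.lru_cache:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # cache every distinct argument's result
def fib(n):
    if n < 2:
        return n  # base cases: F(0)=0, F(1)=1
    # Each subproblem is computed once; repeat calls hit the cache in O(1)
    return fib(n - 1) + fib(n - 2)

print(fib(9))  # 34
```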
Bitwise operations work directly on the binary representation of numbers. They are extremely fast (single CPU instruction) and appear in many optimization tricks, cryptography, embedded systems, and interview problems.
| Operator | Symbol | Example | Result |
|---|---|---|---|
| AND | & | 5 & 3 = 101 & 011 | 001 = 1 |
| OR | \| | 5 \| 3 = 101 \| 011 | 111 = 7 |
| XOR | ^ | 5 ^ 3 = 101 ^ 011 | 110 = 6 |
| NOT | ~ | ~5 | -6 |
| Left Shift | << | 5 << 1 | 10 (multiply by 2) |
| Right Shift | >> | 5 >> 1 | 2 (divide by 2) |
# Check if a number is odd using bitwise AND.
# In binary, all odd numbers end in 1, all even numbers end in 0.
# n & 1 checks the last bit:
#   If the last bit is 1 → odd
#   If the last bit is 0 → even
# This is faster than n % 2 at the hardware level.
n = 5  # binary: 101
if n & 1:  # 101 & 001 = 001 (truthy)
    print("Odd")
else:
    print("Even")
# Output: Odd
# Other useful bit tricks:
# Count set bits (1s) in a number:
count = bin(n).count('1')
# Check if n is a power of 2:
is_power_of_two = (n > 0) and (n & (n - 1)) == 0
Graph traversal means visiting every vertex in a graph systematically. There are two fundamental strategies, and understanding when to use each is critical.
DFS explores as deep as possible along one branch before backtracking. It uses a stack (either explicit or the recursion call stack) to remember where to go next.
DFS explores: root → go deep on left → backtrack → go deep on right → backtrack.
When to use DFS:
- Detecting cycles
- Topological sorting
- Checking whether any path exists between two nodes
- Backtracking problems (mazes, puzzles, permutations)
def dfs(graph, node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    print(node, end=" ")  # process this node
    for neighbor in graph[node]:
        if neighbor not in visited:  # only visit unvisited neighbors
            dfs(graph, neighbor, visited)
# Time Complexity: O(V + E), each vertex and edge visited once
# Space Complexity: O(V), recursion stack depth plus visited set
BFS explores all neighbors at the current level before going deeper. It uses a queue to process nodes in the order they were discovered.
BFS explores: root → all level-1 neighbors → all level-2 neighbors → and so on.
When to use BFS:
- Shortest path in an unweighted graph
- Level-order traversal of trees
- Finding the nearest match (fewest moves, fewest hops)
from collections import deque
def bfs(graph, start):
    visited = set([start])
    queue = deque([start])
    while queue:
        node = queue.popleft()  # process the oldest discovered node
        print(node, end=" ")
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)  # schedule neighbor for future processing
# Time Complexity: O(V + E), each vertex and edge visited once
# Space Complexity: O(V), queue plus visited set
Key difference: DFS goes deep (stack/recursion), BFS goes wide (queue). DFS finds a path; BFS finds the shortest path in an unweighted graph.
A Segment Tree is a binary tree used for efficiently solving range query problems: problems where you need to repeatedly query or update a contiguous range of an array.
What it enables:
- Range queries (sum, min, max) over any interval [l, r]
- Point updates to a single element
- Range updates (with lazy propagation)
All three operations in O(log n) time, compared to O(n) for naive approaches.
How it works: The root stores a value for the entire array. Each node stores the answer for its range. Left child covers the left half, right child covers the right half. Querying decomposes your range into O(log n) segments already stored in the tree.
class SegmentTree {
    int[] tree; // internal tree array; size 4*n is a safe upper bound
    int n;

    SegmentTree(int[] arr) {
        n = arr.length;
        tree = new int[4 * n];
        build(arr, 0, 0, n - 1); // build from root (node 0, range 0 to n-1)
    }

    // Build the tree recursively.
    // node: current tree node index
    // start, end: the range of the original array this node covers
    void build(int[] arr, int node, int start, int end) {
        if (start == end) {
            // Leaf node: covers a single element
            tree[node] = arr[start];
        } else {
            int mid = (start + end) / 2;
            // Build left child (covers start to mid)
            build(arr, 2 * node + 1, start, mid);
            // Build right child (covers mid+1 to end)
            build(arr, 2 * node + 2, mid + 1, end);
            // Internal node stores the sum of its two children
            tree[node] = tree[2 * node + 1] + tree[2 * node + 2];
        }
    }

    // Query sum for range [l, r]
    int query(int node, int start, int end, int l, int r) {
        if (r < start || end < l) return 0; // completely outside range
        if (l <= start && end <= r) return tree[node]; // completely inside range
        int mid = (start + end) / 2;
        int leftSum = query(2 * node + 1, start, mid, l, r);
        int rightSum = query(2 * node + 2, mid + 1, end, l, r);
        return leftSum + rightSum;
    }
}
A Fenwick Tree (also called Binary Indexed Tree or BIT) solves the same prefix-sum problems as a Segment Tree, but with a simpler implementation and lower constant factors.
Core insight: Using the bitwise property i & (-i) (isolates the lowest set bit of i), each index in the BIT array is responsible for a specific range of elements. Updates and prefix queries both traverse a chain of O(log n) indices.
When to prefer Fenwick over Segment Tree:
- You only need prefix/range sums with point updates.
- You want shorter, less error-prone code.
- Memory matters: a BIT uses n + 1 integers instead of up to 4n.
class FenwickTree {
    int[] bit; // BIT array (1-indexed, so size n+1)
    int n;

    FenwickTree(int n) {
        this.n = n;
        bit = new int[n + 1]; // index 0 is unused
    }

    // Add delta to index i and propagate to all affected ranges.
    // i & (-i) gives the lowest set bit: it determines how far to jump.
    void update(int i, int delta) {
        while (i <= n) {
            bit[i] += delta;
            i += i & (-i); // move to the next responsible index
        }
    }

    // Return prefix sum from index 1 to i
    int query(int i) {
        int sum = 0;
        while (i > 0) {
            sum += bit[i];
            i -= i & (-i); // move to the parent range
        }
        return sum;
    }

    // Range sum from index l to r
    int rangeQuery(int l, int r) {
        return query(r) - query(l - 1);
    }
}
Dijkstra's algorithm finds the shortest path from a single source vertex to all other vertices in a weighted graph with non-negative edge weights.
Core idea: Maintain a priority queue (min-heap) of (distance, node) pairs. Always process the unvisited node with the smallest known distance. When you process a node, update distances to its neighbors if the path through that node is shorter.
Why non-negative weights only? If you relax a node and later discover a shorter path (via a negative edge), Dijkstra's assumption breaks. Use Bellman-Ford for graphs with negative edges.
Time complexity: O((V + E) log V) using a binary min-heap.
import java.util.*;

class Dijkstra {
    // Pair represents an edge destination and its weight
    static class Pair {
        int node, dist;
        Pair(int n, int d) { node = n; dist = d; }
    }

    static int[] dijkstra(List<Pair>[] graph, int src) {
        int V = graph.length;
        int[] dist = new int[V];
        Arrays.fill(dist, Integer.MAX_VALUE); // all distances start at infinity
        dist[src] = 0; // source to itself is 0

        // Min-heap ordered by distance
        PriorityQueue<Pair> pq = new PriorityQueue<>(
            Comparator.comparingInt(a -> a.dist));
        pq.add(new Pair(src, 0));

        while (!pq.isEmpty()) {
            Pair curr = pq.poll();
            int u = curr.node;
            if (curr.dist > dist[u]) continue; // stale entry: a shorter path to u was already found
            for (Pair neighbor : graph[u]) {
                int v = neighbor.node;
                int weight = neighbor.dist;
                // If we found a shorter path to v through u, update it
                if (dist[u] + weight < dist[v]) {
                    dist[v] = dist[u] + weight;
                    pq.add(new Pair(v, dist[v]));
                }
            }
        }
        return dist; // dist[i] = shortest distance from src to node i
    }
}
class SegmentTree {
    constructor(arr) {
        this.n = arr.length;
        // 4 * n is a safe allocation size for the segment tree array
        this.tree = new Array(4 * this.n).fill(0);
        this.build(arr, 0, 0, this.n - 1);
    }

    // Recursively build the tree
    // node: current index in this.tree
    // start, end: range of the original array covered by this node
    build(arr, node, start, end) {
        if (start === end) {
            // Leaf node: stores a single array element
            this.tree[node] = arr[start];
            return;
        }
        const mid = Math.floor((start + end) / 2);
        this.build(arr, 2 * node + 1, start, mid);    // build left child
        this.build(arr, 2 * node + 2, mid + 1, end);  // build right child
        // Parent node = sum of both children
        this.tree[node] = this.tree[2 * node + 1] + this.tree[2 * node + 2];
    }

    // Query sum for range [l, r]
    query(node, start, end, l, r) {
        if (r < start || end < l) return 0; // out of range
        if (l <= start && end <= r) return this.tree[node]; // fully in range
        const mid = Math.floor((start + end) / 2);
        return (
            this.query(2 * node + 1, start, mid, l, r) +
            this.query(2 * node + 2, mid + 1, end, l, r)
        );
    }
}
class FenwickTree {
    constructor(n) {
        this.n = n;
        // 1-indexed: index 0 is intentionally unused
        this.bit = new Array(n + 1).fill(0);
    }

    // Update index i by adding delta.
    // i & -i isolates the lowest set bit: it determines which ranges to update.
    update(i, delta) {
        while (i <= this.n) {
            this.bit[i] += delta;
            i += i & -i; // jump to the next affected range
        }
    }

    // Return prefix sum from index 1 to i
    query(i) {
        let sum = 0;
        while (i > 0) {
            sum += this.bit[i];
            i -= i & -i; // move to parent range
        }
        return sum;
    }

    // Range sum from l to r (inclusive)
    rangeQuery(l, r) {
        return this.query(r) - this.query(l - 1);
    }
}
The 0/1 Knapsack problem is a classic DP problem: given n items, each with a weight and a value, and a knapsack of capacity W, find the maximum total value you can carry. Each item can either be taken (1) or left (0); you cannot take a fraction.
DP formulation:
- dp[i][w] = maximum value achievable using the first i items with weight capacity w
- Skip item i: dp[i][w] = dp[i-1][w]
- Take item i (possible only if wt[i-1] <= w): dp[i][w] = val[i-1] + dp[i-1][w - wt[i-1]]

// W = total weight capacity of knapsack
// wt = array of item weights
// val = array of item values (same index as wt)
// n = number of items
function knapsack(W, wt, val, n) {
    // Build a (n+1) x (W+1) table, initialized to 0
    // dp[i][w] = best value with first i items and capacity w
    const dp = Array.from({ length: n + 1 }, () =>
        Array(W + 1).fill(0)
    );
    for (let i = 1; i <= n; i++) {
        for (let w = 1; w <= W; w++) {
            if (wt[i - 1] <= w) {
                // Choice: skip item i OR take item i
                dp[i][w] = Math.max(
                    val[i - 1] + dp[i - 1][w - wt[i - 1]], // take item i
                    dp[i - 1][w] // skip item i
                );
            } else {
                // Item is too heavy for the current capacity: must skip
                dp[i][w] = dp[i - 1][w];
            }
        }
    }
    return dp[n][W]; // answer: max value with all n items, full capacity W
}

// Example:
// Items: weights = [2, 3, 4], values = [3, 4, 5], W = 5
// Best: take item 1 (w=2, v=3) + item 2 (w=3, v=4) → total weight 5, value 7
console.log(knapsack(5, [2, 3, 4], [3, 4, 5], 3)); // Output: 7
Technical interviews at top companies are not memory tests; they test your structured thinking, communication, and problem-solving process. Here is the complete approach:
Step 1: Understand the problem
Read carefully. Ask clarifying questions before writing any code. "What is the range of inputs? Can there be duplicates? What should I return if the input is empty?" This shows maturity and prevents wasted effort.

Step 2: Explore with examples
Trace through 2–3 small examples by hand. One happy-path case, one edge case. This builds intuition and often reveals the pattern.

Step 3: State your approach
Before coding, explain your planned approach out loud. Mention the data structure you'll use and why. Mention the time and space complexity you expect. The interviewer can correct you before you spend 15 minutes coding the wrong approach.

Step 4: Code clearly
Use meaningful variable names. Write one logical piece at a time. Talk while you type; narrate your decisions.

Step 5: Test and dry run
Trace through your code with a sample input step by step. Then check edge cases: empty array, single element, all duplicates, maximum values, negative numbers.

Step 6: Optimize if asked
After a working solution, discuss whether there's a way to reduce time or space complexity. Can an O(n²) solution become O(n log n) with a sort? Can O(n) space become O(1)?
| Category | Examples to Test |
|---|---|
| Empty input | [], "", null |
| Single element | [42], "a" |
| All duplicates | [3, 3, 3, 3] |
| Already sorted | [1, 2, 3, 4] |
| Reverse sorted | [4, 3, 2, 1] |
| Negative numbers | [-5, -1, 0, 3] |
| Integer overflow | very large values |
Q: Why is binary search not valid on unsorted data?
Binary search works by eliminating half the search space at each step: it assumes that if the target is less than the middle element, it must be in the left half. This assumption only holds if the array is sorted. On unsorted data, the target could be anywhere, so discarding half the elements is incorrect.

Q: Difference between a stack and the recursion call stack?
The recursion call stack is an implicit, OS-managed stack that stores function call frames (local variables, return address, parameters). A manually implemented stack is an explicit data structure you control in your program. Both follow LIFO order, but the call stack is invisible and has limited size; deep recursion causes a stack overflow.
Q: How to detect a cycle in a linked list?
Use Floyd's Cycle Detection Algorithm (tortoise and hare): a slow pointer moves one step at a time while a fast pointer moves two. If they ever meet, a cycle exists; if fast reaches None, there is no cycle. Time: O(n), Space: O(1).
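A minimal sketch of the tortoise-and-hare check (the Node layout matches the one used earlier in the article):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def has_cycle(head):
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next          # tortoise: one step
        fast = fast.next.next     # hare: two steps
        if slow is fast:
            return True           # pointers met inside a cycle
    return False                  # hare fell off the end: no cycle

a, b, c = Node(1), Node(2), Node(3)
a.next, b.next = b, c
print(has_cycle(a))  # False
c.next = b           # create a cycle: 1 -> 2 -> 3 -> 2 -> ...
print(has_cycle(a))  # True
```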
Q: BFS and DFS time complexity on an adjacency list?
Both are O(V + E) where V is vertices and E is edges. Every vertex is visited once and every edge is examined once. On an adjacency matrix, both become O(Vยฒ) because checking all neighbors requires scanning a full row.
Q: How does hashing behave in the worst case?
In the worst case, all keys hash to the same bucket, creating a chain of length n. Every operation (insert, search, delete) degenerates to O(n). This happens with poor hash functions or adversarial inputs. Good hash functions with load factor management keep average operations at O(1).
Q: Segment tree vs Fenwick tree: when to use which?
Use a Fenwick Tree when you only need prefix sums and point updates: it's simpler to implement and slightly faster in practice. Use a Segment Tree when you need arbitrary range queries (range min, max, GCD), lazy propagation for range updates, or more complex merge operations.

Q: Greedy vs Dynamic Programming: how to decide?
If you can prove that the greedy choice property holds (local optimal = global optimal), use greedy: it's simpler and faster. If the problem has overlapping subproblems and requires considering multiple choices at each step (not just the current best), use DP. When in doubt, try greedy first and look for a counterexample to rule it out.
Q: Why does memoization reduce recursion time so dramatically?
Without memoization, a recursive solution like Fibonacci recomputes the same subproblems exponentially: fib(40) recomputes fib(2) millions of times. Memoization stores the result of each unique subproblem the first time it is computed. Every subsequent call for that subproblem returns the cached result in O(1). This transforms exponential O(2^n) time into linear O(n) time.
Should I start DSA with coding or theory?
Start with short theory, just enough to understand what a concept is and why it exists. Then immediately solve 2–3 coding problems on that topic. Alternate between the two. Pure theory without coding builds no problem-solving muscle. Pure coding without theory leaves you unable to recognize when a pattern applies.

Which language is best for DSA interviews?
The best language is the one you can write fastest without syntax errors and explain clearly. Java and C++ are most common in competitive programming. Python is the shortest and most readable. JavaScript is acceptable and growing. Choose one language and stick with it; switching mid-preparation is costly.

How much DSA is enough for placements?
Core requirements: Arrays, Strings, Hashing, Linked Lists, Stacks, Queues, Trees, Graphs, Recursion, Sorting, Searching, Basic DP. A strong understanding of these covers 80% of placement interviews. After that, add Tries, Heaps, Segment Trees, and advanced DP for product-based company interviews.
DSA is not memorization; it is a way of thinking. Every concept you learn changes how you see problems. The engineer who deeply understands arrays, trees, and graphs naturally writes code that scales.
The path to mastery is simple but not easy:
- Learn one concept at a time.
- Solve problems on it until the pattern feels automatic.
- Review your mistakes and revisit weak topics.
- Repeat, daily.
Consistency over time is the only real strategy. Start today, solve one problem at a time, and trust the process.