Network+ · Intermediate · 13 min read

What Does SIP Mean?

Also known as: Session Initiation Protocol, SIP trunk, VoIP SIP

Reviewed by Johnson Ajibi · Senior Network & Security Engineer · MSc IT Security

This page mentions older exam versions. See the Current Exam Context and Legacy Exam Context sections below for the updated mapping.

Quick Definition

SIP (Session Initiation Protocol) is an application-layer control protocol defined in RFC 3261 that establishes, modifies, and terminates multimedia communication sessions such as voice over IP (VoIP) calls, video conferences, and instant messaging. It works by exchanging text-based messages between endpoints (user agents) and servers (proxy, redirect, registrar) to locate participants, negotiate session parameters, and manage call flow. SIP is independent of the underlying transport protocols (UDP, TCP, TLS) and does not carry media itself; instead, it uses protocols like RTP for media transport and SDP for session description. Its primary purpose is to provide a flexible, scalable signaling framework that enables interoperability between different vendors and networks, replacing older circuit-switched signaling systems like SS7 in modern IP telephony environments.

Must Know for Exams

On the Network+ exam (N10-008/009), SIP is listed with its port numbers in the common ports and protocols objective under Domain 1.0 (Networking Fundamentals). It also comes up in VoIP and unified communications scenarios under Domain 2.0 (Network Implementation) and Domain 3.0 (Network Operations).

Key exam focus areas include:

(1) SIP's role as a signaling protocol vs. RTP as a media protocol: candidates must know that SIP sets up and tears down calls, while RTP carries the actual voice/video.

(2) SIP default ports: UDP/TCP 5060 for unencrypted signaling, TCP 5061 for TLS-encrypted signaling.

(3) SIP methods: INVITE, ACK, BYE, REGISTER. Know which method is used for call initiation, call termination, and registration.

(4) SIP vs. H.323: SIP is text-based and more flexible; H.323 is binary and older.

(5) SIP in the OSI model: application layer (Layer 7).

(6) SIP and NAT traversal issues: why NAT can break SIP and how STUN/TURN/ICE help.

(7) SIP trunking: connecting a private PBX to the PSTN via SIP.

Exam questions often present a scenario where a VoIP call fails and ask which protocol is responsible for signaling or media—the correct answer is SIP for signaling, RTP for media. Another common question asks which port SIP uses by default—answer 5060 (UDP/TCP).

Simple Meaning

Think of SIP as the digital equivalent of a restaurant host. When you want to have a conversation (a call) with someone, SIP is the host who finds the person, checks if they are available, and sets up a table (the connection) for you. The host doesn't cook the food or serve the meal—that's the media (your voice) carried by RTP.

SIP just handles the introductions, the seating, and the goodbye. If you need to add someone else to the call (like pulling up another chair), SIP handles that too. And when the meal is over, the host clears the table and ends the session.

Without SIP, your VoIP phone would have no way to 'find' the other person or negotiate how to communicate—it would be like walking into a restaurant with no host, no reservations, and no way to know if your friend is even there.

Full Technical Definition

SIP (Session Initiation Protocol) is an application-layer signaling protocol standardized by the IETF in RFC 3261 (obsoleting RFC 2543). It operates at Layer 7 of the OSI model and is used to create, modify, and terminate multimedia sessions between two or more participants. SIP is text-based, similar to HTTP, and uses a request-response model with methods such as INVITE, ACK, BYE, CANCEL, REGISTER, OPTIONS, and SUBSCRIBE.

Each SIP message contains a start-line, headers (From, To, Call-ID, CSeq, Via, Contact, etc.), and an optional message body (typically carrying a Session Description Protocol or SDP payload that describes media capabilities like codecs, IP addresses, and ports). SIP endpoints are called User Agents (UAs), which can be clients (UAC) or servers (UAS).
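
The message layout described above can be illustrated with a small parser. This is a minimal sketch, not a full RFC 3261 implementation: the function name `parse_sip_message` and the example addresses, tags, and Call-ID are made up for illustration, and the sketch assumes one value per header name, no folded header lines, and no compact header forms.

```python
def parse_sip_message(raw: str) -> dict:
    """Split a SIP message into its start-line, headers, and body."""
    # An empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; header values may contain colons.
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lower-cased.
        headers[name.strip().lower()] = value.strip()
    return {"start_line": start_line, "headers": headers, "body": body}


# Hypothetical INVITE with the headers named in the text and no body.
EXAMPLE_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710@192.0.2.10\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Contact: <sip:alice@192.0.2.10>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

msg = parse_sip_message(EXAMPLE_INVITE)
print(msg["start_line"])       # the request line
print(msg["headers"]["cseq"])  # sequence number and method
```

A production parser would also need to handle repeated headers such as Via, which this sketch would overwrite.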

Network infrastructure includes proxy servers (route requests), redirect servers (provide alternate addresses), and registrar servers (accept REGISTER requests and update location databases). SIP uses a URI scheme (sip:user@domain) to identify users. It can run over UDP (port 5060), TCP (port 5060), or TLS (port 5061).
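
To make the URI scheme and default ports concrete, here is a rough sketch of a SIP URI parser. The function name `parse_sip_uri` is hypothetical. The sketch also accepts the `sips:` scheme (the secure variant, which defaults to port 5061), and it assumes hostnames or IPv4 addresses only, so bracketed IPv6 hosts are not handled.

```python
# Default signaling ports: 5060 for sip:, 5061 for sips: (TLS).
DEFAULT_PORTS = {"sip": 5060, "sips": 5061}


def parse_sip_uri(uri: str) -> dict:
    """Parse a URI of the form sip:user@host[:port][;params]."""
    scheme, sep, rest = uri.partition(":")
    scheme = scheme.lower()
    if not sep or scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a SIP URI: {uri!r}")
    # Drop URI parameters (;transport=tcp) and headers (?subject=...).
    rest = rest.split(";", 1)[0].split("?", 1)[0]
    user, _, hostport = rest.rpartition("@")
    host, _, port = hostport.partition(":")
    return {
        "scheme": scheme,
        "user": user or None,
        "host": host,
        # Fall back to the scheme's default port when none is given.
        "port": int(port) if port else DEFAULT_PORTS[scheme],
    }


print(parse_sip_uri("sip:alice@example.com"))
print(parse_sip_uri("sips:bob@example.com"))
print(parse_sip_uri("sip:bob@example.com:5080;transport=tcp"))
```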

SIP does not transport media; media streams are handled separately by RTP/RTCP. Compared to H.323, SIP is simpler, more modular, and more internet-friendly, making it the dominant signaling protocol for modern VoIP and unified communications systems. It is also commonly paired with WebRTC, which does not mandate a particular signaling protocol.

Real-Life Example

At Acme Corp, an employee named Alice wants to call Bob in another office. Alice picks up her SIP desk phone and dials Bob's extension. Her phone (a SIP User Agent Client) sends an INVITE request to the company's SIP proxy server at sip.acmecorp.com. The proxy server looks up Bob's current location in the registrar database: Bob is registered from his softphone on his laptop. The proxy forwards the INVITE to Bob's softphone.

Bob's phone rings and sends back a 180 Ringing response, which the proxy relays to Alice. Bob answers, and his phone sends a 200 OK response containing an SDP body with his media capabilities (e.g., G.711 codec, IP 10.1.2.3, port 20000). Alice's phone sends an ACK to confirm, and an RTP media stream is established directly between Alice's desk phone and Bob's laptop. They talk for 10 minutes.

When Bob hangs up, his phone sends a BYE request, Alice's phone responds with 200 OK, and the session is terminated. The SIP proxy logged the call for billing and troubleshooting.
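
The SDP body in Bob's 200 OK might look roughly like the text below. This is an illustrative sketch: the SDP lines are constructed from the example values (G.711, 10.1.2.3, port 20000), the origin line numbers are invented, and `parse_sdp_audio` is a hypothetical helper that assumes a single audio stream with a session-level `c=` line. G.711 mu-law is carried as RTP payload type 0 (PCMU).

```python
def parse_sdp_audio(sdp: str) -> dict:
    """Extract the media IP, audio port, and codec map from an SDP body."""
    info = {"ip": None, "port": None, "codecs": {}}
    for line in sdp.splitlines():
        key, _, value = line.partition("=")
        if key == "c":
            # c=<network type> <address type> <connection address>
            info["ip"] = value.split()[2]
        elif key == "m" and value.startswith("audio"):
            # m=<media> <port> <transport> <payload types...>
            info["port"] = int(value.split()[1])
        elif key == "a" and value.startswith("rtpmap:"):
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            payload_type, encoding = value[len("rtpmap:"):].split(" ", 1)
            info["codecs"][int(payload_type)] = encoding
    return info


# Hypothetical SDP answer built from the values in the example above.
BOB_SDP = (
    "v=0\r\n"
    "o=bob 2890844527 2890844527 IN IP4 10.1.2.3\r\n"
    "s=-\r\n"
    "c=IN IP4 10.1.2.3\r\n"
    "t=0 0\r\n"
    "m=audio 20000 RTP/AVP 0\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
)

print(parse_sdp_audio(BOB_SDP))
```

Alice's phone would use the extracted address and port as the destination for its RTP packets.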

Why This Term Matters

SIP is the backbone of modern VoIP and unified communications. IT professionals must understand SIP to deploy, configure, and troubleshoot IP phone systems, softphones, and SIP trunks. Without SIP knowledge, you cannot diagnose call setup failures, one-way audio, registration issues, or interoperability problems between different vendors.

SIP signaling also has to pass through firewalls and NAT, often with the help of session border controllers (SBCs), which makes it relevant to network security and QoS planning. On the career side, SIP expertise is a sought-after skill for network administrators, VoIP engineers, and unified communications specialists. It also supports higher-level certifications like CCNP Collaboration and real-world problem-solving in enterprise telephony environments.

How It Appears in Exam Questions

SIP appears in Network+ exam questions in several patterns.

Pattern 1: 'A user reports that they can make VoIP calls but cannot hear the other party. Which protocol is most likely misconfigured?' Wrong answers include SIP (because it's signaling), TCP, or UDP. Correct answer: RTP (media stream).

Pattern 2: 'Which protocol is used to register a VoIP phone with a call server?' Wrong answers: RTP, H.323, or DNS. Correct answer: SIP (REGISTER method).

Pattern 3: 'A technician needs to secure SIP signaling between two sites. Which port should be allowed through the firewall?' Wrong answers: 5060 UDP, 80 TCP, 443 TCP. Correct answer: 5061 TCP (TLS-encrypted SIP).

Pattern 4: 'Which of the following is a text-based signaling protocol used for VoIP?' Wrong answers: RTP (media), H.323 (binary), or MGCP (media gateway control). Correct answer: SIP.

To identify the correct answer, always ask: 'Is this about setting up/tearing down the call (SIP) or about the actual voice/video (RTP)?' and 'Is it encrypted (5061) or unencrypted (5060)?'

Example Scenario

Step 1: Alice's SIP phone sends an INVITE message to the SIP proxy server at sip.company.com.

Step 2: The proxy server queries the location database and finds Bob's SIP URI (bob@company.com) registered from his softphone at IP 192.168.1.50.

Step 3: The proxy forwards the INVITE to Bob's softphone. Bob's phone rings and sends a 180 Ringing response back through the proxy to Alice.

Step 4: Bob answers the call; his softphone sends a 200 OK response containing an SDP body with his IP address and codec preferences.

Step 5: Alice's phone sends an ACK to confirm, and an RTP media stream is established directly between Alice's desk phone and Bob's laptop. They talk.

When Bob hangs up, his phone sends a BYE, Alice responds 200 OK, and the session ends.
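
The location lookup in Step 2 can be pictured as a simple mapping from a SIP URI to the address a device registered from. The sketch below is a toy model under stated assumptions: `LocationService` and `route_invite` are hypothetical names, the port 5060 on the contact is assumed, and real registrars also track expiry times, multiple contacts per user, and authentication.

```python
class LocationService:
    """Toy location database: SIP URI -> contact address."""

    def __init__(self):
        self._bindings = {}

    def register(self, uri: str, contact: str) -> None:
        # Called when a REGISTER request arrives; stores the binding.
        self._bindings[uri] = contact

    def lookup(self, uri: str):
        # Returns the registered contact, or None if there is no binding.
        return self._bindings.get(uri)


def route_invite(location: LocationService, request_uri: str) -> str:
    """Decide what a proxy does with an INVITE for request_uri."""
    contact = location.lookup(request_uri)
    if contact is None:
        # No binding found, so answer with an error response.
        return "404 Not Found"
    return f"forward INVITE to {contact}"


location = LocationService()
location.register("sip:bob@company.com", "192.168.1.50:5060")

print(route_invite(location, "sip:bob@company.com"))
print(route_invite(location, "sip:carol@company.com"))
```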

Common Mistakes

SIP carries the actual voice or video data.

SIP is a signaling protocol—it only sets up, modifies, and tears down sessions. The actual media (voice/video) is carried by RTP (Real-time Transport Protocol). Confusing the two is a common exam trap.

SIP signals, RTP sings. If it's about call setup, think SIP. If it's about voice quality, think RTP.

SIP always uses TCP port 5060.

SIP can use UDP or TCP on port 5060 for unencrypted signaling. It uses TCP port 5061 for TLS-encrypted signaling. The default is often UDP, but TCP is also supported. The exam may ask which port is used for encrypted SIP.

5060 = unencrypted (UDP or TCP), 5061 = encrypted (TCP only).

SIP is the same as H.323.

Both are VoIP signaling protocols, but H.323 is older, binary-based, and more complex. SIP is text-based, simpler, and more widely used in modern IP telephony. The exam tests the difference.

SIP = text-based, modern, flexible. H.323 = binary, older, monolithic.

Exam Trap — Don't Get Fooled

The trap: The most dangerous trap is believing that SIP is responsible for transporting voice or video data. Many candidates see 'VoIP' and immediately choose SIP for everything, including media transport. The wrong answer 'SIP' is often offered as a distractor when the correct answer is RTP.

Why learners choose it: Learners associate SIP with VoIP so strongly that they assume it handles all aspects of the call, including the actual voice. The question stem might say 'a user can make calls but cannot hear the other party', and the candidate thinks 'SIP is broken' because they don't separate signaling from media.

How to avoid it: Use the mantra: 'SIP signals, RTP sings.' Whenever you see a question about voice quality, one-way audio, or media streams, immediately eliminate SIP and look for RTP. If the question is about call setup, registration, or teardown, SIP is the answer.

Commonly Confused With

SIP vs RTP (Real-time Transport Protocol)

SIP is a signaling protocol that sets up and tears down sessions; RTP is a transport protocol that carries the actual audio/video data. SIP uses text messages (INVITE, BYE) over ports 5060/5061; RTP uses UDP on dynamically negotiated ports for media streams. The RTP port range is configurable and varies by vendor; 16384-32767 is a common default on Cisco equipment.

When Alice calls Bob, SIP handles the 'hello, are you there?' and 'goodbye', while RTP carries their actual conversation.

SIP vs H.323

H.323 is an older, binary-based protocol suite for VoIP and videoconferencing. SIP is text-based (like HTTP), simpler, and more scalable. H.323 defines many sub-protocols (H.225, H.245); SIP is modular and relies on SDP for media negotiation.

A legacy video conferencing system might use H.323, while a modern VoIP phone system uses SIP.

Step-by-Step Breakdown

Step 1 — Registration

A SIP phone (User Agent) sends a REGISTER request to a SIP registrar server, providing its current IP address and SIP URI. The registrar stores this binding in a location database so that incoming calls can be routed to the correct device.

Step 2 — Call Initiation (INVITE)

The caller's UA sends an INVITE request to the callee's SIP URI. The INVITE includes a Session Description Protocol (SDP) body that lists the caller's supported codecs, IP address, and port for media. This request may be routed through proxy servers.

Step 3 — Provisional Responses

While the call is being set up, the caller receives provisional (1xx) responses that indicate progress: typically 100 Trying from the proxy and 180 Ringing from the callee's UA. The 180 Ringing response is relayed back to the caller along the same proxy path.

Step 4 — Call Acceptance (200 OK)

When the callee answers, their UA sends a 200 OK response. This response contains an SDP body with the callee's media capabilities (codec, IP, port). The caller's UA receives this and knows how to send media.

Step 5 — Acknowledgment and Media Flow

The caller's UA sends an ACK request to confirm receipt of the 200 OK. At this point, a direct RTP media stream is established between the two UAs. The call is active until one party sends a BYE request to terminate the session.
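
The call setup and teardown in Steps 2 to 5 can be sketched as a small state machine. This is a teaching aid rather than the transaction state machine from RFC 3261: the state names (`idle`, `calling`, `ringing`, `answered`, `in_call`, `terminating`) and the `run_call` helper are invented for illustration, and registration (Step 1) is left out.

```python
# (current state, message seen) -> next state
TRANSITIONS = {
    ("idle", "INVITE"): "calling",
    ("calling", "100 Trying"): "calling",
    ("calling", "180 Ringing"): "ringing",
    ("calling", "200 OK"): "answered",      # callee answers without ringing
    ("ringing", "200 OK"): "answered",
    ("answered", "ACK"): "in_call",         # media can flow from here
    ("in_call", "BYE"): "terminating",
    ("terminating", "200 OK"): "idle",
}


def run_call(messages: list) -> str:
    """Feed a sequence of SIP messages through the table; return the final state."""
    state = "idle"
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {message!r} in state {state!r}")
        state = TRANSITIONS[key]
    return state


setup = ["INVITE", "100 Trying", "180 Ringing", "200 OK", "ACK"]
print(run_call(setup))                      # call is established
print(run_call(setup + ["BYE", "200 OK"]))  # call has been torn down
```

Feeding the table an out-of-order message, such as a BYE before any INVITE, raises an error, which mirrors why the ordering of the steps matters.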

Practical Mini-Lesson

SIP (Session Initiation Protocol) is the standard for initiating, managing, and ending real-time communication sessions over IP networks. Think of it as the 'traffic cop' for VoIP calls.

Core concept: SIP is a signaling protocol; it does not carry voice or video. That job belongs to RTP (Real-time Transport Protocol). SIP uses text-based messages (like HTTP) to invite users, negotiate capabilities, and terminate sessions.

How it works: A SIP endpoint (User Agent) sends an INVITE request to a SIP server (proxy or redirect). The server locates the called party using a location service (often populated by REGISTER messages). The INVITE contains a Session Description Protocol (SDP) payload that describes the caller's media capabilities (codecs, IP, ports). The called party responds with 180 Ringing, then 200 OK (with its own SDP). The caller sends ACK, and a direct RTP media stream is established. To end the call, either party sends BYE, and the other responds with 200 OK.

Comparison to alternatives: H.323 is an older, binary-based protocol suite that is more complex and less internet-friendly. SIP is simpler, more scalable, and dominates modern VoIP.

Configuration notes: SIP typically uses UDP or TCP port 5060 for unencrypted signaling, and TCP port 5061 for TLS-encrypted signaling. NAT can break SIP because the IP addresses in SDP may be private; solutions include STUN, TURN, ICE, or Session Border Controllers (SBCs).

Key takeaway: On the Network+ exam, remember that SIP handles call setup/teardown (signaling) and uses ports 5060/5061, while RTP handles the actual media (voice/video) and uses dynamic UDP ports (often from a configurable range such as 16384-32767).
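
The NAT problem from the configuration notes can be checked with a few lines of code. The sketch below uses Python's standard `ipaddress` module to test whether the connection address advertised in an SDP body is private. The helper name `sdp_address_is_private` and both sample `c=` lines are assumptions for illustration; the sketch only reads the first `c=` line it finds.

```python
import ipaddress


def sdp_address_is_private(sdp: str) -> bool:
    """Return True if the first c= line in the SDP advertises a private IP."""
    for line in sdp.splitlines():
        if line.startswith("c="):
            # c=<network type> <address type> <connection address>
            address = line.split()[2]
            return ipaddress.ip_address(address).is_private
    raise ValueError("no c= line found in SDP")


# A phone behind NAT advertises its internal address, which the
# remote party on the public internet cannot reach.
print(sdp_address_is_private("v=0\r\nc=IN IP4 10.1.2.3\r\n"))

# A publicly routable address would not have this problem.
print(sdp_address_is_private("v=0\r\nc=IN IP4 8.8.8.8\r\n"))
```

When the check is true for traffic leaving the site, the media address needs to be rewritten or discovered by one of the mechanisms listed above (STUN, TURN, ICE, or an SBC).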

Memory Tip

SIP = 'Set up, Invite, Party' — SIP is the host that sets up the party (call), invites people (INVITE), and cleans up when the party ends (BYE). Remember: SIP signals, RTP sings.

Covered in These Exams

Current Exam Context

Current exam versions that test this topic — use these objectives when studying.

Legacy Exam Context

Older materials may mention these exam versions, but learners should use the current objectives for their target exam.

N10-008, N10-009 (current version)

Frequently Asked Questions

What is the difference between SIP and RTP?

SIP is a signaling protocol that sets up, modifies, and terminates communication sessions. RTP (Real-time Transport Protocol) carries the actual audio/video data. SIP uses text messages over ports 5060/5061; RTP uses UDP and dynamic ports for media streams. Think of SIP as the host and RTP as the conversation.

Does SIP use TCP or UDP?

SIP can use both. Unencrypted SIP typically uses UDP or TCP on port 5060. Encrypted SIP (SIP over TLS) uses TCP on port 5061. UDP is common because it has lower connection overhead, while TCP is used when reliable delivery is needed, for example for large messages, or as the transport underneath TLS. The exam expects you to know both ports and their purposes.

Can SIP work through NAT?

SIP often has trouble with NAT because the IP addresses and ports in the SDP body are private, not public. Solutions include STUN (Session Traversal Utilities for NAT), TURN (Traversal Using Relays around NAT), ICE (Interactive Connectivity Establishment), or using a Session Border Controller (SBC) to mediate the connection.

What is a SIP trunk?

A SIP trunk is a virtual connection that allows a private PBX to connect to the Public Switched Telephone Network (PSTN) over an IP network, typically the internet or a dedicated provider link, using SIP. It replaces traditional analog or ISDN phone lines. SIP trunks are cost-effective and scalable, supporting multiple concurrent calls over a single connection.

Why is SIP important for the Network+ exam?

SIP is a key protocol for VoIP and unified communications, which are covered in Network+ domains on network implementation and operations. You need to know its role (signaling), default ports (5060/5061), and how it differs from RTP. Questions often test your ability to distinguish between signaling and media protocols.

Summary

1. SIP (Session Initiation Protocol) is an application-layer signaling protocol that establishes, modifies, and terminates multimedia sessions like VoIP calls and video conferences.

2. SIP is text-based, uses ports 5060 (UDP/TCP) for unencrypted signaling and 5061 (TCP) for encrypted signaling, and relies on RTP for actual media transport.

3. On the Network+ exam, remember that SIP is for signaling (not media), uses INVITE/BYE/REGISTER methods, and is often confused with RTP. If the question is about call setup or teardown, the answer is SIP; if it's about voice or video quality, the answer is RTP.