The IP package provides a wide array of methods for working with both IPv4 and IPv6 addresses and ranges.
An IP address is a numerical label assigned to each device connected to a computer network that uses the Internet Protocol for communication. The Internet Protocol uses those labels to identify nodes, such as hosts or network interfaces, in order to relay datagrams between them across network boundaries. There are two versions of the Internet Protocol, version 4 and version 6, which differ in many respects.
The code is mostly written in C for performance and is based on the ip4r PostgreSQL extension. IP objects were designed to behave as much as possible like R vectors, but there are some pitfalls. Please read the caveat section at the end of this vignette.
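The IP package provides six different IP classes: IPv4 and IPv6 for addresses, IPv4r and IPv6r for address ranges, and IP and IPr, which can hold IPv4 and IPv6 addresses or ranges in the same vector.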
Let’s start with IPv4 input. Calling the ipv4() and ipv4r() functions creates IPv4 and IPv4r objects, respectively, from strings:
## [1] "192.168.0.0"
## [1] "192.168.0.0/16"
## [1] "192.168.0.0/16"
## same thing using an IPv4 object and an integer giving the number of addresses in the range
ipv4r(ipv4("192.168.0.0"), as.integer(2L^16 -1) )
## bool: nna Rip_ipv4_op2_bool_gt_0 1 1
## [1] "192.168.0.0/16"
## bool: nna Rip_ipv4_op2_bool_gt_0 1 1
## [1] "192.168.0.0/16"
Likewise, the ipv6() and ipv6r() functions create IPv6 and IPv6r objects:
## [1] "fe80::"
## [1] "fe80::/10"
## [1] "fe80::/10"
## Error in .local(.Object, ...) : ipv6s should have the same NA
## arith: nna Rip_ipv6_op2_arith_addv6_0 1
## bool: nna Rip_ipv6_op2_bool_gt_0 1 1
## [1] "fe80::/10"
From the penultimate example, we can see that when input fails for any reason, the returned IP value is NA. In addition, numeric inputs are limited to values less than 2^54: IEEE 754 double-precision numbers carry a 53-bit significand, so operations on larger values may lose precision. Please refer to the Arith-methods section of the manual for more information.
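The precision limit on numeric input can be observed in base R alone. The following is a minimal sketch that does not use the IP package; the variable names are illustrative only.

```r
# Number of significand bits in an R double (IEEE 754 binary64)
significand_bits <- .Machine$double.digits

big <- 2^53

# Below 2^53, every integer is represented exactly
exact_below <- ((big - 1) + 1) == big

# Above 2^53, adding 1 is rounded away: the result equals 2^53 again
above <- big + 1
lost_above <- above == big

print(c(exact_below = exact_below, lost_above = lost_above))
```

Both comparisons evaluate to TRUE, which is why large numeric inputs cannot be trusted to identify a single address.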
IP and IPr objects are created as follows:
## [1] "192.168.0.0" "fe80::"
## [1] 4 6
## [1] "192.168.0.0" "fe80::"
## [1] "192.168.0.0" "fe80::"
## [1] "192.168.0.0/16" "fe80::/10"
## same thing using dash notation
ipr(c( "192.168.0.0-192.168.255.255", "fe80::-febf:ffff:ffff:ffff:ffff:ffff:ffff:ffff") )
## [1] "192.168.0.0/16" "fe80::/10"
The IP package also provides direct input from integers. Note that input values are treated as unsigned integers:
## [1] "0.0.0.1"
## Warning in ipv4(-1L): negative values
## bool: nna Rip_ipv4_op2_bool_gt_0 1 1
## [1] TRUE
## [1] NA
## [1] "::1"
## bool: nna Rip_ipv6_op2_bool_gt_0 1 1
## [1] TRUE
## Warning in ipv6(NA_integer_): 1 NA introduced during input_int32 IPv6 operation
## [1] NA
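Since R integers are signed 32-bit values, treating them as unsigned amounts to adding 2^32 to negative inputs. The base R sketch below illustrates that reinterpretation; `u32_to_dotted()` is a hypothetical helper written for this example, not a function of the IP package, and it makes no claim about how the package itself handles negative input.

```r
# Hypothetical helper: reinterpret signed 32-bit integers as unsigned
# and format them as dotted-quad strings (base R only).
u32_to_dotted <- function(x) {
  u <- as.numeric(x)
  neg <- !is.na(u) & u < 0
  u[neg] <- u[neg] + 2^32          # two's complement reinterpretation
  shifts <- c(24, 16, 8, 0)
  vapply(u, function(v) {
    if (is.na(v)) return(NA_character_)
    # extract the four octets, most significant first
    paste((v %/% 2^shifts) %% 256, collapse = ".")
  }, character(1))
}

u32_to_dotted(c(1L, -1062731772L, NA))
```

Under this reading, -1062731772L corresponds to 192.168.0.4, which is the value that appears for host4 in the assignment example later in this vignette.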
The dash notation for address ranges gives greater flexibility than CIDR notation, as it enables input of arbitrary ranges:
## [1] "192.168.0.0-192.168.0.9"
## [1] "192.168.0.10-192.168.0.19"
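To see why ranges such as the two above need dash notation, recall that a CIDR block must contain a power-of-two number of addresses and start on a multiple of its size. The base R sketch below checks both conditions on addresses given as plain numbers; `is_cidr_block()` is a hypothetical helper for illustration, not part of the IP package.

```r
# Hypothetical helper: can the inclusive range [lo, hi] be written as one CIDR block?
# lo and hi are IPv4 addresses given as plain (unsigned) numbers.
is_cidr_block <- function(lo, hi) {
  stopifnot(all(hi >= lo))
  size <- hi - lo + 1
  is_pow2 <- 2^round(log2(size)) == size   # size must be a power of two
  is_pow2 & (lo %% size == 0)              # and lo must be aligned on size
}

base <- 192 * 2^24 + 168 * 2^16            # 192.168.0.0

is_cidr_block(base, base + 2^16 - 1)       # 192.168.0.0-192.168.255.255
is_cidr_block(base, base + 9)              # 192.168.0.0-192.168.0.9
is_cidr_block(base + 10, base + 19)        # 192.168.0.10-192.168.0.19
```

Only the first range passes the check; the two ten-address ranges have no single CIDR equivalent.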
Also, some methods are specific to certain classes. For instance, the IP package defines getters for address ranges:
## [1] "192.168.0.0"
## [1] "192.168.255.255"
as well as for IP objects:
## [1] "192.168.0.0" NA
## [1] NA "fe80::"
## [1] "192.168.0.0"
## [1] "fe80::"
Note that some methods only work for IPv4 and IPv6 objects and not for IP objects. This is partly because some methods have not been implemented yet, but mostly by design. Despite their similarities, IPv4 and IPv6 are different protocols, so at some point you will have to deal with them separately, for instance when masking, sorting or matching addresses. In addition, IP methods are a bit slower. And, despite fifteen years of IPv6 deployment, most of the addresses you encounter today are still IPv4 in many circumstances.
IP* objects were designed to behave as much as possible like base R atomic vectors. Hence, you can input addresses from named vectors:
## router host1
## "192.168.0.0" "192.168.0.1"
## [1] "router" "host1"
Like any R vector, IP address vectors support slicing:
## router host1
## "192.168.0.0" "192.168.0.1"
## host1 router
## "192.168.0.1" "192.168.0.0"
and vector assignment:
x[3:4] <- c( host2 = '192.168.0.2', host3 = '192.168.0.3' )
x[5] <- c( host4 = -1062731772L)
data.frame(n=names(x), x=x)
## n x
## router router 192.168.0.0
## host1 host1 192.168.0.1
## host2 host2 192.168.0.2
## host3 host3 192.168.0.3
## host4 host4 192.168.0.4
Note that, when doing assignment, new values are automatically coerced. We can also grow vectors:
## [1] NA NA "0.0.0.3" NA "::5"
## [1] NA NA
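Growing by assignment follows the usual base R rule: assigning past the end of a vector extends it and fills the gap with NA. The base R sketch below shows this on a plain integer vector; it does not reproduce the original IP call, which is not shown here.

```r
# Start from an empty integer vector and assign beyond its current length
v <- integer(0)
v[3] <- 3L     # positions 1 and 2 are filled with NA
v[5] <- 5L     # position 4 is filled with NA
print(v)
```

The NA pattern is the same as in the first output above, where only the third and fifth elements hold an address.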
In addition to vector slicing and assignment, the IP package also provides methods for arithmetic, comparison and bitwise operators.
Multiplication, division and modulo are not implemented yet. The !, &, | and ^ operators behave differently from their base R counterparts in that they perform bitwise operations, much as in the C language:
In addition, ipv4.netmask(n) and ipv4.hostmask(n) (and their IPv6 counterparts ipv6.netmask(n) and ipv6.hostmask(n)) return a network mask and a host mask of size n, respectively.
## arith: nna Rip_ipv4_op2_arith_subv4_0 1
## bool: nna Rip_ipv4_op2_bool_eq_0 1 1
## [1] TRUE
## bool: nna Rip_ipv4_op2_bool_eq_0 1 1
## [1] TRUE
## generate the high end of the range from a mask
ipv4r(x, x|ipv4.hostmask(16) )==ipv4r('192.168.0.0/16')
## bool: nna Rip_ipv4_op2_bool_gt_0 1 1
## bool: nna Rip_ipv4r_op2_bool_eq_0 1 1
## [1] TRUE
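The mask arithmetic behind the last example can be sketched with plain numbers in base R. The helpers below are hypothetical stand-ins written for this illustration, not the package's own functions, and they assume that "size n" means a prefix length of n bits, as the `x | ipv4.hostmask(16)` example suggests.

```r
# Hypothetical base R stand-ins, working on IPv4 addresses stored as plain numbers
hostmask <- function(n) 2^(32 - n) - 1        # low (32 - n) bits set
netmask  <- function(n) 2^32 - 2^(32 - n)     # high n bits set

# Arithmetic equivalents of the bitwise operations
and_netmask <- function(x, n) x - x %% 2^(32 - n)                # x & netmask(n)
or_hostmask <- function(x, n) and_netmask(x, n) + hostmask(n)    # x | hostmask(n)

x <- 192 * 2^24 + 168 * 2^16                  # 192.168.0.0

lo <- and_netmask(x, 16)                      # low end of the /16 range
hi <- or_hostmask(x, 16)                      # high end of the /16 range
print(c(lo = lo, hi = hi, size = hi - lo + 1))
```

The high end differs from the low end by exactly the host mask, so the range holds 2^16 addresses.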
As with any R vector, binary operators apply to vectors of different lengths, with the shorter operand being recycled:
## arith: na Rip_ipv4_op2_arith_addv4_0
## [1] "0.0.0.0" "0.0.0.2" "0.0.0.2" "0.0.0.4" "0.0.0.4" "0.0.0.6"
## bool: na Rip_ipv4_op2_bool_lt_0
## [1] FALSE FALSE TRUE TRUE TRUE TRUE
## arith: na Rip_ipv4_op2_arith_addv4_0
## [1] "0.0.0.0" "0.0.0.2" "0.0.0.2" "0.0.0.4" "0.0.0.4" "0.0.0.6" "0.0.0.6"
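The recycling rule is the same as for plain integers. The base R sketch below is an assumption about the shape of the lost input: adding a length-two vector to 0:5 and 0:6 gives last octets that match the two sums printed above, but the original IP calls are not shown in this text.

```r
# The shorter operand c(0L, 1L) is recycled along the longer one
six <- 0:5 + c(0L, 1L)
print(six)

# When the longer length is not a multiple of the shorter one,
# base R still recycles but emits a warning
seven <- suppressWarnings(0:6 + c(0L, 1L))
print(seven)
```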
IP* arithmetic behaves like R integer arithmetic. Hence, overflow yields NA:
and so does any operation involving NA:
## bool: na Rip_ipv4_op2_bool_eq_0
## [1] NA TRUE
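For reference, this is the base R integer behavior that IP* arithmetic is said to mirror. The sketch uses plain integers only, not IP objects.

```r
# Integer overflow yields NA (base R also emits a warning, suppressed here)
overflow <- suppressWarnings(.Machine$integer.max + 1L)
print(overflow)

# NA propagates through arithmetic and comparison
na_sum <- NA_integer_ + 1L
na_cmp <- c(NA, 1L) == 1L
print(na_sum)
print(na_cmp)
```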
There are no Summary (min(), max(),…) methods yet, but table() works:
## x
## 192.168.0.0 192.168.0.1 192.168.0.2 192.168.0.3 192.168.0.4
## 10 10 6 9 10
IPv6 and IP objects behave similarly, and most of what precedes works for them (with some obvious modifications). There are some exceptions for IP objects because a few methods are still missing or do not apply, for example when assigning to an IP vector:
x <- ip(c(
router = '192.168.0.0'
, host1 = '192.168.0.1'
))
## does not work yet
try(x[3:4] <- c( host2 = '192.168.0.2', host3 = '192.168.0.3' ))
## Error in `[<-`(`*tmp*`, 3:4, value = c(host2 = "192.168.0.2", host3 = "192.168.0.3" :
## unimplemented assign method for IP object
## we need to convert to IP first
x[3:4] <- ip(c( host2 = '192.168.0.2', host3 = '192.168.0.3' ))
## does not work because we cannot tell the IP version
try(x[5] <- c( host4 = -1062731772L))
## Error in `[<-`(`*tmp*`, 5, value = c(host4 = -1062731772L)) :
## unimplemented assign method for IP object
The IP package also provides methods for IP* lookup and DNS resolution. But we’ll cover that in another vignette.
In order to make R believe that IP objects are regular vectors, every IP object inherits from the integer class. This means that, by virtue of method dispatch, if R does not find a method for an IP object but one exists for an integer vector, the latter will be called. It may not return an IP object, or it may have undesirable side effects that cause unpredictable behavior. For example, in an early version of the package, multiplication returned a potentially corrupted IP object. This has now been fixed for the most common cases, but it is virtually impossible to prevent entirely, as it is a feature of R object-oriented programming. Therefore, if something strange happens, it might be the result of calling a method not defined by the IP package on an IP object.
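The dispatch fallback described above can be reproduced with a toy S4 class in base R. In the sketch below, the class `Addr` and the generic `describe()` are hypothetical names invented for this illustration; they are not part of the IP package and say nothing about its actual class definitions.

```r
library(methods)

# A toy class that, like the IP classes, inherits from integer
setClass("Addr", contains = "integer")

# A generic with a method defined for integer only
setGeneric("describe", function(x) standardGeneric("describe"))
setMethod("describe", "integer", function(x) "integer method")

a <- new("Addr", c(1L, 2L))

# No method exists for Addr, so dispatch falls back to the integer method
before <- describe(a)

# Once a dedicated method is defined, it takes precedence
setMethod("describe", "Addr", function(x) "Addr method")
after <- describe(a)

print(c(before = before, after = after))
```

Any function with an integer method but no dedicated method for the subclass is silently handled by the former, which is the situation the caveat warns about.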