The issue isn't so much how many lines of code it takes; two or five makes little difference. The question is whether it will work beyond the example you posted here.
I haven't come across this sort of thing in the wild, but I had a go at constructing another example that I thought could conceivably exist.
I've since come across a couple more cases and added them to the test suite.
I've also included a table drawn using box-drawing characters. You don't come across this much these days, but for completeness' sake it's here.
x1 <- "
+------------+------+------+----------+--------------------------+
|    Date    | Emp1 | Case | Priority | PriorityCountinLast7days |
+------------+------+------+----------+--------------------------+
| 2018-06-01 | A    | A1   |        0 |                        0 |
| 2018-06-03 | A    | A2   |        0 |                        1 |
| 2018-06-02 | B    | B2   |        0 |                        2 |
| 2018-06-03 | B    | B3   |        0 |                        3 |
+------------+------+------+----------+--------------------------+
"

x2 <- "
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
    Date    | Emp1 | Case | Priority | PriorityCountinLast7days
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
 2018-06-01 | A    | A|1  |        0 |                        0
 2018-06-03 | A    | A|2  |        0 |                        1
 2018-06-02 | B    | B|2  |        0 |                        2
 2018-06-03 | B    | B|3  |        0 |                        3
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
"

x3 <- "
 Maths | English | Science | History | Class
  0.1  |   0.2   |   0.3   |   0.2   |  Y2
  0.9  |   0.5   |   0.7   |   0.4   |  Y1
  0.2  |   0.4   |   0.6   |   0.2   |  Y2
  0.9  |   0.5   |   0.2   |   0.7   |  Y1
"

x4 <- "
    Season    |  Team | W | AHWO
-------------------------------------
1 | 2017/2018 | TeamA | 2 | 1.75
2 | 2017/2018 | TeamB | 1 | 1.85
3 | 2017/2018 | TeamC | 1 | 1.70
4 | 2016/2017 | TeamA | 1 | 1.49
5 | 2016/2017 | TeamB | 3 | 1.51
6 | 2016/2017 | TeamC | 2 | N/A
"

x5 <- "
    A   B   C
  ┌───┬───┬───┐
A │ 5 │ 1 │ 3 │
  ├───┼───┼───┤
B │ 2 │ 5 │ 3 │
  ├───┼───┼───┤
C │ 3 │ 4 │ 4 │
  └───┴───┴───┘
"

x6 <- "
------------------------------------------------------------
|date              |Material          |Description         |
|----------------------------------------------------------|
|10/04/2013        |WM.5597394        |PNEUMATIC           |
|11/07/2013        |GB.D040790        |RING                |
------------------------------------------------------------
------------------------------------------------------------
|date              |Material          |Description         |
|----------------------------------------------------------|
|08/06/2013        |WM.4M01004A05     |TOUCHEUR            |
|08/06/2013        |WM.4M010108-1     |LEVER               |
------------------------------------------------------------
"
My go at a function
f <- function(x=x6, header=TRUE, rem.dup.header=header,
  na.strings=c("NA", "N/A"), stringsAsFactors=FALSE, ...) {

    # read each row as a character string
    x <- scan(text=x, what="character", sep="\n", quiet=TRUE)

    # keep only lines containing alphanumerics
    x <- x[grep("[[:alnum:]]", x)]

    # remove vertical bars with trailing or leading space
    x <- gsub("\\|? | \\|?", " ", x)

    # remove vertical bars at beginning and end of string
    x <- gsub("\\|?$|^\\|?", "", x)

    # remove vertical box-drawing characters
    # (U+2502, U+2503, U+2506, U+2507, U+250A, U+250B)
    x <- gsub("\U2502|\U2503|\U2506|\U2507|\U250A|\U250B", " ", x)

    # drop repeats of the header line (e.g. from stacked tables)
    if (rem.dup.header) {
        dup.header <- x == x[1]
        dup.header[1] <- FALSE
        x <- x[!dup.header]
    }

    # read the result as a table
    read.table(text=paste(x, collapse="\n"), header=header,
      na.strings=na.strings, stringsAsFactors=stringsAsFactors, ...)
}

lapply(c(x1, x2, x3, x4, x5, x6), f)
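To see why a bar inside a value (as in x2) survives while the delimiters do not, here is a minimal sketch that applies the two bar-stripping substitutions to a single row on its own. The `row` string is a made-up example in the style of x2, and `step1`, `step2` and `fields` are names used only for this illustration. It assumes R's default regex engine (no `perl=TRUE`), which is what the function above uses.

```r
# One pipe-delimited row with a literal bar inside a value ("A|1").
# This row is a hypothetical example, not one of the inputs above.
row <- "| 2018-06-01 | A | A|1 | 0 | 0 |"

# Step 1: replace bars that have a space next to them by a space.
# The bar in "A|1" has no adjacent space, so it is left alone.
step1 <- gsub("\\|? | \\|?", " ", row)

# Step 2: remove a bar sitting at the very start or end of the line.
step2 <- gsub("\\|?$|^\\|?", "", step1)

# Split on whitespace to inspect the fields read.table would see.
fields <- scan(text = step2, what = "character", quiet = TRUE)
print(fields)
```

The same reasoning explains why a table whose values contain a bar surrounded by spaces would not survive this approach: such a bar is indistinguishable from a delimiter.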
Output
[[1]]
        Date Emp1 Case Priority PriorityCountinLast7days
1 2018-06-01    A   A1        0                        0
2 2018-06-03    A   A2        0                        1
3 2018-06-02    B   B2        0                        2
4 2018-06-03    B   B3        0                        3

[[2]]
        Date Emp1 Case Priority PriorityCountinLast7days
1 2018-06-01    A  A|1        0                        0
2 2018-06-03    A  A|2        0                        1
3 2018-06-02    B  B|2        0                        2
4 2018-06-03    B  B|3        0                        3

[[3]]
  Maths English Science History Class
1   0.1     0.2     0.3     0.2    Y2
2   0.9     0.5     0.7     0.4    Y1
3   0.2     0.4     0.6     0.2    Y2
4   0.9     0.5     0.2     0.7    Y1

[[4]]
     Season  Team W AHWO
1 2017/2018 TeamA 2 1.75
2 2017/2018 TeamB 1 1.85
3 2017/2018 TeamC 1 1.70
4 2016/2017 TeamA 1 1.49
5 2016/2017 TeamB 3 1.51
6 2016/2017 TeamC 2   NA

[[5]]
  A B C
A 5 1 3
B 2 5 3
C 3 4 4

[[6]]
        date      Material Description
1 10/04/2013    WM.5597394   PNEUMATIC
2 11/07/2013    GB.D040790        RING
3 08/06/2013 WM.4M01004A05    TOUCHEUR
4 08/06/2013 WM.4M010108-1       LEVER
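The sixth result combines two stacked tables into one data frame. A rough sketch of that duplicate-header step in isolation, assuming the lines have already been stripped of bars and rule lines (the `lines` vector below is a shortened, hand-cleaned stand-in for x6, and `merged` is just an illustrative name):

```r
# Already-cleaned lines from two stacked tables sharing one header.
# Hypothetical, shortened stand-in for what x6 looks like after cleaning.
lines <- c("date Material Description",
           "10/04/2013 WM.5597394 PNEUMATIC",
           "date Material Description",
           "08/06/2013 WM.4M01004A05 TOUCHEUR")

# Flag every line identical to the first, then unflag the first itself
dup.header <- lines == lines[1]
dup.header[1] <- FALSE

# Drop the repeated header and parse the remainder as one table
merged <- read.table(text = paste(lines[!dup.header], collapse = "\n"),
                     header = TRUE, stringsAsFactors = FALSE)
print(merged)
```

The comparison is exact string equality, so a repeated header that differs from the first one in spacing or capitalisation would be kept as a data row.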
x3 is from here (will have to look at the edit history).
x4 is from here
x6 is from here