K&R Exercise 1-22. Fold (break) lines at specified column
Intro
I'm going through the K&R book (2nd edition, ANSI C ver.) and want to get the most from it: learn (outdated) C and practice problem-solving at the same time. I believe that the author's intention was to give the reader a good exercise, to make him think hard about what he can do with the tools introduced, so I'm sticking to program features introduced so far and using "future" features and standards only if they don't change the program logic.
Compiling with gcc -Wall -Wextra -Wconversion -pedantic -std=c99.
K&R Exercise 1-22
Write a program to "fold" long input lines into two or more shorter lines after the last non-blank character that occurs before the n-th column of input. Make sure your program does something intelligent with very long lines, and if there are no blanks or tabs before the specified column.
Solution
The solution attempts to reuse functions coded in the previous exercises (getline & copy) and make the solution reusable as well. In that spirit, a new function size_t foldline(char * restrict ins, char * restrict outs, size_t fcol, size_t tw); is coded to solve the problem. However, it requires a full buffer to be able to determine the break-point, so I coded size_t fillbuf(char s[], size_t sz); to top up the buffer.
I wanted to make the folding non-destructive and possibly reversible, so the program doesn't delete anything, and adds a \ when we break individual "words". The output can be reversed by deleting
(?<= )\n|(?<=\t)\n|\\\n
pattern matches (obviously if the original had some matches, they'll get deleted too). Would you say this design approach is good?
In the spirit of writing reusable code, should I move appending the '\n' and '\' to the main routine and make the function just split the string at breakpoint? Or even, make one to find the breakpoint, and other to split the string?
## Code
/* Exercise 1-22. Write a program to "fold" long input lines into two or more
 * shorter lines after the last non-blank character that occurs before the n-th
 * column of input. Make sure your program does something intelligent with very
 * long lines, and if there are no blanks or tabs before the specified column.
 */

#include <stdio.h>
#include <stdbool.h>

#define MAXTW 16 // max. tab width
#define MAXFC 100 // max. fold column, must be >=MAXTW
#define LINEBUF MAXFC+2 // line buffer size, must be >MAXFC+1

size_t getline(char line[], size_t sz);
void copy(char * restrict to, char const * restrict from);
size_t foldline(char * restrict ins, char * restrict outs, size_t fcol,
        size_t tw); // style Q, how to indent this best?
size_t fillbuf(char s[], size_t sz);

int main(void)
{
    char line[LINEBUF]; // input buffer
    size_t len; // input buffer string length
    size_t fcol = 10; // column to fold at
    size_t tw = 4; // tab width

    if (fcol > MAXFC) {
        return -1;
    }
    if (tw > MAXTW) {
        return -2;
    }

    len = getline(line, LINEBUF);
    while (len > 0) {
        char xline[LINEBUF]; // folded part
        size_t xlen; // folded part string length

        // fold the line (or part of one)
        xlen = foldline(line, xline, fcol, tw);
        printf("%s", line);
        // did we fold?
        if (xlen > 0) {
            // we printed only the first part, and must run the 2nd part through
            // the loop as well
            copy(line, xline);
            if (line[xlen-1] == '\n') {
                len = xlen;
            }
            else {
                // if there's no '\n' at the end, there's more of the line and
                // we must fill the buffer to be able to process it properly
                len = fillbuf(line, LINEBUF);
            }
        }
        else {
            len = getline(line, LINEBUF);
        }
    }
    return 0;
}

/* Folds a line at the given column. The input string gets truncated to have
 * `fcol` chars + '\n', and the excess goes into output string.
 * Non-destructive (doesn't delete whitespace) and adds a '\' char before the
 * '\n' if it has to break a word. Can be reversed by deleting
 * "(?<= )\n|(?<=\t)\n|\\\n" regex pattern matches unless the original file had
 * matches as well.
 */
size_t foldline(char * restrict ins, char * restrict outs, size_t fcol,
        size_t tw)
{
    /* Find i & col such that they will mark either the position of termination
     * (\0 or \n) or whatever the char in the overflow column.
     * Find lnbi such that it will mark the last non-blank char before the
     * folding column.
     */
    size_t i;
    size_t lnbi;
    size_t col;
    char lc = ' ';

    for (col = 0, i = 0, lnbi = 0; ins[i] != '\0' && ins[i] != '\n' &&
            col < fcol; ++i) {
        if (ins[i] == ' ') {
            ++col;
            if (lc != ' ' && lc != '\t') {
                lnbi = i-1;
            }
        }
        else if (ins[i] == '\t') {
            col = (col + tw) / tw * tw;
            if (lc != ' ' && lc != '\t') {
                lnbi = i-1;
            }
        }
        else {
            ++col;
        }
        lc = ins[i];
    }

    // Determine where to fold at
    size_t foldat;
    if (col < fcol) {
        // don't fold, terminated before the fold column
        outs[0] = '\0';
        return 0;
    }
    else if (col == fcol) {
        // maybe fold, we have something in the overflow
        if (ins[i] == '\n' || ins[i] == '\0') {
            // don't fold, termination can stay in the overflow
            outs[0] = '\0';
            return 0;
        }
        else if (lnbi > 0 || (ins[0] != ' ' && ins[0] != '\t' && (ins[1] == ' '
                || ins[1] == '\t'))) {
            // fold after the whitespace following the last non-blank char
            foldat = lnbi+2;
        }
        else {
            // fold at overflow
            foldat = i;
        }
    }
    else {
        // col > fcol only possible if ins[i-1] == '\t' so we fold and place the
        // tab on the next line
        foldat = i-1;
    }

    // Fold
    size_t j = 0, k;
    // add a marker if we're folding after a non-blank char
    if (ins[foldat-1] != ' ' && ins[foldat-1] != '\t') {
        outs[j++] = ins[foldat-1];
        ins[foldat-1] = '\\';
    }
    for (k = foldat; ins[k] != '\0'; ++j, ++k) {
        outs[j] = ins[k];
    }
    outs[j] = '\0';
    ins[foldat++] = '\n';
    ins[foldat] = '\0';
    return j;
}

/* continue reading a line into `s`, return total string length;
 * the buffer must have free space for at least 1 more char
 */
size_t fillbuf(char s[], size_t sz)
{
    // find end of string
    size_t i;
    for (i = 0; s[i] != '\0'; ++i) {
    }
    // not introduced in the book, but we could achieve the same by c&p
    // getline code here
    return i + getline(&s[i], sz-i);
}

/* getline: read a line into `s`, return string length;
 * `sz` must be >1 to accommodate at least one character and string
 * termination '\0'
 */
size_t getline(char s[], size_t sz)
{
    int c;
    size_t i = 0;
    bool el = false;

    while (i + 1 < sz && !el) {
        c = getchar();
        if (c == EOF) {
            el = true; // note: `break` not introduced yet
        }
        else {
            s[i] = (char) c;
            ++i;
            if (c == '\n') {
                el = true;
            }
        }
    }
    if (i < sz) {
        if (c == EOF && !feof(stdin)) { // EOF due to read error
            i = 0;
        }
        s[i] = '\0';
    }
    return i;
}

/* copy: copy a '\0' terminated string `from` into `to`;
 * assume `to` is big enough;
 */
void copy(char * restrict to, char const * restrict from)
{
    size_t i;
    for (i = 0; from[i] != '\0'; ++i) {
        to[i] = from[i];
    }
    to[i] = '\0';
}
Testing
Output
$ ./ch1-ex-1-22-02 <ch1-ex-1-22-02.c >out.txt
/*
Exercise
1-22.
Write a
program
to "fold"
long
input
lines
into two
or more
*
shorter
lines
after the
last
non-blank
character
that
occurs
before
the n-th
* column
of input.
Make sure
your
program
does
something
intellige\
nt with
very
* long
lines,
and if
there are
no blanks
or tabs
before
the
specified
column.
*/
#include
<stdio.h>
#include
<stdbool.\
h>
#define
MAXTW
16
//
max. tab
width
#define
MAXFC
100
//
max. fold
column,
...
Reversibility
$ ./ch1-ex-1-22-02 <ch1-ex-1-22-02.c | perl -p -e 's/(?<= )\n|(?<=\t)\n|\\\n//g' | diff - ch1-ex-1-22-02.c
returns nothing :)
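The same reversal could also be sketched in C instead of perl. This is only an illustration under stated assumptions: unfold is a hypothetical helper that is not part of the posted program, and it assumes the whole folded text is already held in one '\0'-terminated string. It mirrors the three alternatives of the regex: a newline right after a space, a newline right after a tab, and a backslash followed by a newline.

```c
#include <stddef.h>

/* unfold (hypothetical helper): undo the folding in place and return the
 * new string length. Like the regex, it also removes matching sequences
 * that were already present in the original text.
 */
size_t unfold(char s[])
{
    size_t r;       /* read index */
    size_t w = 0;   /* write index */

    for (r = 0; s[r] != '\0'; ++r) {
        if (s[r] == '\\' && s[r + 1] == '\n') {
            ++r;    /* skip the marker and the newline that follows it */
        }
        else if (s[r] == '\n' && r > 0 &&
                (s[r - 1] == ' ' || s[r - 1] == '\t')) {
            /* newline directly after a blank: drop it */
        }
        else {
            s[w++] = s[r];
        }
    }
    s[w] = '\0';
    return w;
}
```

Applied to the folded text "intellige\", "nt with " and "very" from the output above, this should give back the single line "intelligent with very".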
beginner c strings formatting io
asked 2 days ago
div0man
getline() UB in the pathological case sz == 1 as it tests uninitialized c with c == EOF. – chux 15 hours ago
Got it. Now I'm thinking that the code would continue to run past a read error in some cases, too (fillbuf gets an error and EOF, code continues, next getline reads something and program continues...) – div0man 14 hours ago
1 Answer
Only a small review.
> Would you say this design approach is good?
Yes. I did have trouble following the code though. I was not able to find a test that failed the coding goal.
> In the spirit of writing reusable code, should I move appending the '\n' and '\' to the main routine and make the function just split the string at breakpoint?
Yes, moving that appending out of foldline() does make sense, yet "In the spirit of writing reusable code" I would move as much out of main() as reasonable too. Perhaps an intervening function?
> Or even, make one to find the breakpoint, and other to split the string?
Yes, foldline() is lengthy and loses clarity with its length.
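As a rough illustration of that split, here is a minimal sketch with two hypothetical helpers, find_break and split_at. Neither name comes from the posted code, and the break rule is deliberately simplified (split after the last blank that still fits, otherwise at the overflow column), so it is not a drop-in replacement for foldline(). It assumes tw > 0.

```c
#include <stddef.h>

/* find_break (hypothetical): return the index at which `s` should be split
 * so that the first part fits in `fcol` columns, or 0 if no fold is needed.
 * Tabs advance the column to the next multiple of `tw` (tw must be > 0).
 */
size_t find_break(const char s[], size_t fcol, size_t tw)
{
    size_t i;
    size_t col = 0;
    size_t after_blank = 0;   /* index just past the last blank that fits */

    for (i = 0; s[i] != '\0' && s[i] != '\n'; ++i) {
        size_t next = (s[i] == '\t') ? (col + tw) / tw * tw : col + 1;
        if (next > fcol) {
            /* s[i] does not fit: prefer the last blank, else split here */
            if (after_blank > 0) {
                return after_blank;
            }
            return i > 0 ? i : 1;   /* always make progress */
        }
        col = next;
        if (s[i] == ' ' || s[i] == '\t') {
            after_blank = i + 1;
        }
    }
    return 0;
}

/* split_at (hypothetical): move s[at..] into `rest` and end `s` at `at`;
 * assume `rest` is big enough.
 */
void split_at(char s[], char rest[], size_t at)
{
    size_t j;
    for (j = 0; s[at + j] != '\0'; ++j) {
        rest[j] = s[at + j];
    }
    rest[j] = '\0';
    s[at] = '\0';
}
```

With this shape, the caller decides whether to append a marker and a newline, which keeps both helpers free of output formatting.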
Minor stuff
Avoid order of precedence problems
Consider the effect of bigline[LINEBUF * 2]: it does not double the size. Use () when a #define has an expression.
// #define LINEBUF MAXFC+2
#define LINEBUF (MAXFC+2)
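To make the difference concrete, a small sketch: LINEBUF_BARE and LINEBUF_PAREN are hypothetical names standing in for the two variants of the macro, and the two functions only report the array size each variant would produce for a doubled buffer.

```c
#include <stddef.h>

#define MAXFC 100
#define LINEBUF_BARE  MAXFC+2     /* as in the question */
#define LINEBUF_PAREN (MAXFC+2)   /* with the suggested parentheses */

/* LINEBUF_BARE * 2 expands to 100+2 * 2, which is 104 */
size_t bare_doubled(void)
{
    return sizeof(char[LINEBUF_BARE * 2]);
}

/* LINEBUF_PAREN * 2 expands to (100+2) * 2, which is 204 */
size_t paren_doubled(void)
{
    return sizeof(char[LINEBUF_PAREN * 2]);
}
```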
Uninitialized object evaluation
getline() has UB in the pathological case sz == 1, as it tests the uninitialized c with c == EOF. Simply change
// int c;
int c = 0;
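For context, this is how the whole function might look with that one-line change applied. It is a sketch only: the function is renamed to get_line here (an assumption, to avoid a clash with the POSIX getline declared by some stdio.h headers), and everything else follows the posted code.

```c
#include <stdio.h>
#include <stdbool.h>

/* get_line: read a line into `s`, return string length.
 * Renamed from getline for this sketch only.
 */
size_t get_line(char s[], size_t sz)
{
    int c = 0;      /* initialized, so the c == EOF test below is defined
                       even when the loop body never runs (sz == 1) */
    size_t i = 0;
    bool el = false;

    while (i + 1 < sz && !el) {
        c = getchar();
        if (c == EOF) {
            el = true;
        }
        else {
            s[i] = (char) c;
            ++i;
            if (c == '\n') {
                el = true;
            }
        }
    }
    if (i < sz) {
        if (c == EOF && !feof(stdin)) { /* EOF due to read error */
            i = 0;
        }
        s[i] = '\0';
    }
    return i;
}
```

With sz == 1 the loop is skipped, nothing is read, and the function stores only the terminator and returns 0.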
@div0man I'd leave this question unaccepted for at least a number of days to encourage deeper answers. – chux 14 hours ago
answered 14 hours ago
chux