K&R Exercise 1-22. Fold (break) lines at specified column











Intro



I'm going through the K&R book (2nd edition, ANSI C ver.) and want to get the most from it: learn (outdated) C and practice problem-solving at the same time. I believe that the author's intention was to give the reader a good exercise, to make him think hard about what he can do with the tools introduced, so I'm sticking to program features introduced so far and using "future" features and standards only if they don't change the program logic.



Compiling with gcc -Wall -Wextra -Wconversion -pedantic -std=c99.



K&R Exercise 1-22



Write a program to "fold" long input lines into two or more shorter lines after the last non-blank character that occurs before the n-th column of input. Make sure your program does something intelligent with very long lines, and if there are no blanks or tabs before the specified column.



Solution



The solution attempts to reuse functions coded in the previous exercises (getline & copy) and to be reusable itself. In that spirit, a new function size_t foldline(char * restrict ins, char * restrict outs, size_t fcol, size_t tw); is coded to solve the problem. However, it requires a full buffer to be able to determine the break-point, so I coded size_t fillbuf(char s[], size_t sz); to top up the buffer.



I wanted to make the folding non-destructive and possibly reversible, so the program doesn't delete anything, and adds a \ when we break individual "words". The output can be reversed by deleting (?<= )\n|(?<=\t)\n|\\\n pattern matches (obviously if original had some matches, they'll get deleted too). Would you say this design approach is good?



In the spirit of writing reusable code, should I move appending the '\n' and '\' to the main routine and make the function just split the string at breakpoint? Or even, make one to find the breakpoint, and other to split the string?



Code

/* Exercise 1-22. Write a program to "fold" long input lines into two or more
 * shorter lines after the last non-blank character that occurs before the n-th
 * column of input. Make sure your program does something intelligent with very
 * long lines, and if there are no blanks or tabs before the specified column.
 */

#include <stdio.h>
#include <stdbool.h>

#define MAXTW 16 // max. tab width
#define MAXFC 100 // max. fold column, must be >=MAXTW
#define LINEBUF MAXFC+2 // line buffer size, must be >MAXFC+1

size_t getline(char line[], size_t sz);
void copy(char * restrict to, char const * restrict from);
size_t foldline(char * restrict ins, char * restrict outs, size_t fcol,
                size_t tw); // style Q, how to indent this best?
size_t fillbuf(char s[], size_t sz);

int main(void)
{
    char line[LINEBUF]; // input buffer
    size_t len; // input buffer string length

    size_t fcol = 10; // column to fold at
    size_t tw = 4; // tab width

    if (fcol > MAXFC) {
        return -1;
    }

    if (tw > MAXTW) {
        return -2;
    }

    len = getline(line, LINEBUF);
    while (len > 0) {
        char xline[LINEBUF]; // folded part
        size_t xlen; // folded part string length

        // fold the line (or part of one)
        xlen = foldline(line, xline, fcol, tw);
        printf("%s", line);

        // did we fold?
        if (xlen > 0) {
            // we printed only the first part, and must run the 2nd part through
            // the loop as well
            copy(line, xline);
            if (line[xlen-1] == '\n') {
                len = xlen;
            }
            else {
                // if there's no '\n' at the end, there's more of the line and
                // we must fill the buffer to be able to process it properly
                len = fillbuf(line, LINEBUF);
            }
        }
        else {
            len = getline(line, LINEBUF);
        }
    }
    return 0;
}

/* Folds a line at the given column. The input string gets truncated to have
 * `fcol` chars + '\n', and the excess goes into output string.
 * Non-destructive (doesn't delete whitespace) and adds a '\' char before the
 * '\n' if it has to break a word. Can be reversed by deleting
 * "(?<= )\n|(?<=\t)\n|\\\n" regex pattern matches unless the original file had
 * matches as well.
 */
size_t foldline(char * restrict ins, char * restrict outs, size_t fcol,
                size_t tw)
{
    /* Find i & col such that they will mark either the position of termination
     * (\0 or \n) or whatever the char in the overflow column.
     * Find lnbi such that it will mark the last non-blank char before the
     * folding column.
     */
    size_t i;
    size_t lnbi;
    size_t col;
    char lc = ' ';
    for (col = 0, i = 0, lnbi = 0; ins[i] != '\0' && ins[i] != '\n' &&
            col < fcol; ++i) {
        if (ins[i] == ' ') {
            ++col;
            if (lc != ' ' && lc != '\t') {
                lnbi = i-1;
            }
        }
        else if (ins[i] == '\t') {
            col = (col + tw) / tw * tw;
            if (lc != ' ' && lc != '\t') {
                lnbi = i-1;
            }
        }
        else {
            ++col;
        }
        lc = ins[i];
    }

    // Determine where to fold at
    size_t foldat;
    if (col < fcol) {
        // don't fold, terminated before the fold column
        outs[0] = '\0';
        return 0;
    }
    else if (col == fcol) {
        // maybe fold, we have something in the overflow
        if (ins[i] == '\n' || ins[i] == '\0') {
            // don't fold, termination can stay in the overflow
            outs[0] = '\0';
            return 0;
        }
        else if (lnbi > 0 || (ins[0] != ' ' && ins[0] != '\t' && (ins[1] == ' '
                || ins[1] == '\t'))) {
            // fold after the whitespace following the last non-blank char
            foldat = lnbi+2;
        }
        else {
            // fold at overflow
            foldat = i;
        }
    }
    else {
        // col > fcol only possible if ins[i-1] == '\t' so we fold and place the
        // tab on the next line
        foldat = i-1;
    }

    // Fold
    size_t j = 0, k;
    // add a marker if we're folding after a non-blank char
    if (ins[foldat-1] != ' ' && ins[foldat-1] != '\t') {
        outs[j++] = ins[foldat-1];
        ins[foldat-1] = '\\';
    }
    for (k = foldat; ins[k] != '\0'; ++j, ++k) {
        outs[j] = ins[k];
    }
    outs[j] = '\0';
    ins[foldat++] = '\n';
    ins[foldat] = '\0';
    return j;
}

/* continue reading a line into `s`, return total string length;
 * the buffer must have free space for at least 1 more char
 */
size_t fillbuf(char s[], size_t sz)
{
    // find end of string
    size_t i;
    for (i = 0; s[i] != '\0'; ++i) {
    }

    // not introduced in the book, but we could achieve the same by c&p
    // getline code here
    return i + getline(&s[i], sz-i);
}

/* getline: read a line into `s`, return string length;
 * `sz` must be >1 to accommodate at least one character and string
 * termination '\0'
 */
size_t getline(char s[], size_t sz)
{
    int c;
    size_t i = 0;
    bool el = false;
    while (i + 1 < sz && !el) {
        c = getchar();
        if (c == EOF) {
            el = true; // note: `break` not introduced yet
        }
        else {
            s[i] = (char) c;
            ++i;
            if (c == '\n') {
                el = true;
            }
        }
    }
    if (i < sz) {
        if (c == EOF && !feof(stdin)) { // EOF due to read error
            i = 0;
        }
        s[i] = '\0';
    }
    return i;
}

/* copy: copy a '\0' terminated string `from` into `to`;
 * assume `to` is big enough;
 */
void copy(char * restrict to, char const * restrict from)
{
    size_t i;
    for (i = 0; from[i] != '\0'; ++i) {
        to[i] = from[i];
    }
    to[i] = '\0';
}
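
A note on the tab handling in foldline(): the expression (col + tw) / tw * tw relies on integer division truncating, which rounds col up to the next multiple of tw. The sketch below isolates that step; the function name next_tab_stop is hypothetical and not part of the program above. It also makes visible that tw must be non-zero, which main() as posted does not check (it only compares tw against MAXTW).

```c
#include <assert.h>
#include <stddef.h>

/* next_tab_stop (hypothetical name): column reached after a tab that starts
 * at column `col`, with tab stops every `tw` columns. Integer division
 * truncates, so (col + tw) / tw * tw is the next multiple of tw above col. */
size_t next_tab_stop(size_t col, size_t tw)
{
    assert(tw > 0);     /* tw == 0 would divide by zero */
    return (col + tw) / tw * tw;
}
```

With tw = 4, a tab at column 0 or 3 lands on column 4, and a tab at column 4 lands on column 8.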


Testing



Output



$ ./ch1-ex-1-22-02 <ch1-ex-1-22-02.c >out.txt



/* 
Exercise
1-22.
Write a
program
to "fold"
long
input
lines
into two
or more
*
shorter
lines
after the
last
non-blank
character
that
occurs
before
the n-th
* column
of input.
Make sure
your
program
does
something
intellige\
nt with
very
* long
lines,
and if
there are
no blanks
or tabs
before
the
specified
column.
*/

#include
<stdio.h>
#include
<stdbool.\
h>

#define
MAXTW
16
//
max. tab
width
#define
MAXFC
100
//
max. fold
column,


...



Reversibility



$ ./ch1-ex-1-22-02 <ch1-ex-1-22-02.c | perl -p -e 's/(?<= )\n|(?<=\t)\n|\\\n//g' | diff - ch1-ex-1-22-02.c


returns nothing :)
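
The same reversal can be written in C. This is only a sketch under a few assumptions: the name unfold is made up, the folded text is already in a '\0' terminated string, and out is at least as large as in. Like the perl pattern, it drops a newline that directly follows a space or tab, and drops a backslash together with the newline after it, so it shares the caveat that matches present in the original text are deleted too.

```c
#include <assert.h>
#include <stddef.h>

/* unfold (hypothetical helper): undo the folding in `in`, writing to `out`.
 * `out` must have room for as many chars as `in`, including the '\0'.
 * Returns the length of the unfolded string. */
size_t unfold(const char *in, char *out)
{
    size_t i = 0, j = 0;
    assert(in != NULL && out != NULL);
    while (in[i] != '\0') {
        if (in[i] == '\\' && in[i + 1] == '\n') {
            i += 2;     /* word-break marker: drop '\' and the newline */
        }
        else if (in[i] == '\n' && i > 0 &&
                 (in[i - 1] == ' ' || in[i - 1] == '\t')) {
            ++i;        /* newline inserted after whitespace: drop it */
        }
        else {
            out[j++] = in[i++];     /* everything else is kept */
        }
    }
    out[j] = '\0';
    return j;
}
```

A newline that follows a non-blank character other than a backslash is treated as an original line end and kept.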










  • getline() UB in the pathological case sz == 1 as it tests uninitialized c with c == EOF.
    – chux
    15 hours ago










  • Got it. Now I'm thinking that the code would continue to run past a read error in some cases, too (fillbuf gets an error and EOF, code continues, next getline reads something and program continues...)
    – div0man
    14 hours ago

















beginner c strings formatting io
















asked 2 days ago









div0man













1 Answer
Only a small review.




> Would you say this design approach is good?




Yes. I did have trouble following the code though. I was not able to find a test that failed the coding goal.




> In the spirit of writing reusable code, should I move appending the '\n' and '\' to the main routine and make the function just split the string at breakpoint?




Yes, moving that appending out of foldline() does make sense. Yet "in the spirit of writing reusable code" I would also move as much out of main() as is reasonable, perhaps into an intervening function.




Or even, make one to find the breakpoint, and other to split the string?




Yes, foldline() is lengthy and loses clarity because of its length.
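As a rough illustration of what a separate breakpoint finder could look like, here is a minimal sketch. The name find_break() and its signature are hypothetical, not taken from the question's code. It assumes fcol > 0 and counts every character as one column, so it ignores the tab-width handling that the real foldline() performs.

```c
#include <stddef.h>

/* Hypothetical helper, not the question's foldline(): returns how many
 * characters of s belong on the first folded line.
 * Assumptions: fcol > 0, and each character occupies one column
 * (no tab expansion). */
size_t find_break(const char *s, size_t fcol)
{
    size_t i;
    size_t last = 0;    /* index just past the last blank seen so far */

    for (i = 0; s[i] != '\0' && s[i] != '\n'; ++i) {
        if (i == fcol)  /* line is too long: fold it */
            return last > 0 ? last : fcol;  /* after last blank, else hard break */
        if (s[i] == ' ' || s[i] == '\t')
            last = i + 1;
    }
    return i;           /* the line fits, nothing to fold */
}
```

The caller would then copy that many characters to the output and decide on its own whether to append a newline or a continuation marker, which keeps the search and the splitting independent.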





Minor stuff



Avoid order of precedence problems



Consider the effect of bigline[LINEBUF * 2]: with #define LINEBUF MAXFC+2 it expands to MAXFC+2 * 2, that is MAXFC + 4, which does not double the size. Use () when a define has an expression.



// #define LINEBUF MAXFC+2 
#define LINEBUF (MAXFC+2)
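The difference can be checked with sizeof. In this small sketch the value 80 for MAXFC and the names LINEBUF_BAD, LINEBUF_GOOD, bad_size() and good_size() are made up for illustration; only the two forms of the macro come from the review.

```c
#include <stddef.h>

#define MAXFC 80                    /* hypothetical fold column */
#define LINEBUF_BAD  MAXFC+2        /* unparenthesized expression */
#define LINEBUF_GOOD (MAXFC+2)      /* parenthesized expression */

/* Expands to MAXFC+2 * 2, i.e. 80 + 4 = 84 elements. */
static char bad[LINEBUF_BAD * 2];

/* Expands to (MAXFC+2) * 2, i.e. 82 * 2 = 164 elements. */
static char good[LINEBUF_GOOD * 2];

size_t bad_size(void)  { return sizeof bad; }
size_t good_size(void) { return sizeof good; }
```

With these assumed values, bad_size() yields 84 while good_size() yields 164.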


Uninitialized object evaluation



getline() has undefined behavior (UB) in the pathological case sz == 1, as it tests the uninitialized c with c == EOF. Simply change



// int c;
int c = 0;
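For context, here is a minimal K&R-style line reader with the initialization applied. It is a sketch and not the question's getline(): the name get_line() and the FILE * parameter are assumptions, chosen to avoid a clash with POSIX getline() and to make the function easy to test.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the question's getline(): reads at most
 * sz - 1 characters of one line from fp into s, keeps the '\n' if it
 * was read, and returns the number of characters stored. */
size_t get_line(FILE *fp, char *s, size_t sz)
{
    int c = 0;          /* initialized, so the test below is well defined
                           even when the loop body never runs (sz == 1) */
    size_t i = 0;

    if (sz == 0)
        return 0;       /* no room even for the terminator */
    while (i + 1 < sz && (c = getc(fp)) != EOF && c != '\n')
        s[i++] = (char)c;
    if (c == '\n')      /* only reachable with room left for '\n' and '\0' */
        s[i++] = (char)c;
    s[i] = '\0';
    return i;
}
```

With sz == 1 the loop condition fails before getc() is called, so nothing is read and c still holds 0 when it is compared afterwards.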






















  • @div0man I'd leave this question unaccepted for at least a number of days to encourage deeper answers.
    – chux
    14 hours ago














answered 14 hours ago









chux





























 
